Perceptions of comprehension and cognitive readiness are salient features of
academic  reading  and  studying  and  the  principal  determinants  of  the  learning  strategies 
readers  employ  and  cognitive  resources  they  expend.  Undetected  cognitive  failure  during 
reading is a problem well-documented with young readers, and researchers have recently
established that even adult, skilled readers are often not proficient at monitoring their
cognition. The purpose of this study was to determine whether questions embedded in
expository  text  could  improve  the  correspondence  between  adult  readers'  subjective 
assessments  of  test  readiness  and  their  objective  test  performance  (prediction  calibration). 
A  lesser  consideration  was  whether  embedded  questions  had  an  impact  on  postdiction 
judgments of performance (postdiction calibration). In order to minimize the confounding
effects  of  prior  knowledge,  subjects  were  asked  to  read  a  text  based  on  a  make-believe 
solar  system.  This  experiment  was  prepared  as  a  two-factor  design,  embedded  questions 
(yes/no) and text reinspection (yes/no). Subjects in this study were not cued to process the
questions in any fashion, and therefore effects discovered were learner-produced rather
than  investigator-induced.  The  purpose  of  the  lookback  factor  was  to  separate  the  effects 
of  embedded  questions  on  perceptions  of  cognitive  readiness  when  combined  with  re-study 
decisions  from  the  effects  of  embedded  questions  when  re-study  was  prohibited. 
Participating in the study were 168 college undergraduates. Embedded questions had the
effect of bringing subjective beliefs regarding test readiness into better calibration with
objective test preparedness and may thus be used to change the passive and dysfunctional
relationship  many  readers  have  with  the  text. 
CHAPTER 1
INTRODUCTION 


Reading  expository  prose  is  one  of  the  primary  mechanisms  through  which 
students  are  expected  to  acquire  knowledge  in  academic  settings.  The  principal  motivation 
for such reading is usually to prepare for a test. Students read and study the text until they
believe they have adequately learned the material. Ideally, their subjective judgment of what
they have learned should correspond with their test performance. When subjective beliefs
and judgments regarding performance are correlated with actual performance, the learner is
said to be calibrated (Glendberg, Sanocki, Epstein, & Morris, 1987; Lichtenstein,
Fischhoff, & Phillips, 1982). This study concerns readers' subjective beliefs and judgments
regarding  test  readiness  as  they  relate  to  subsequent  test  performance. 

Perceptions  of  comprehension  and  cognitive  readiness  are  salient  features  of 
academic  reading  and  studying  and  the  principal  determinants  of  the  learning  strategies 
readers  employ  and  the  cognitive  resources  they  expend.  For  example,  if  during  the  course 
of reading readers detect that their comprehension has failed, they are likely to engage in
some form of remedial behavior, possibly rereading previous material more carefully or
applying a different learning strategy. If comprehension failure goes undetected, the reader
will not engage in strategic behavior and will be likely to stop reading before the material
has been learned. This problem is particularly troublesome when the subject matter of the
text is cumulative in nature, such as in science or mathematics texts. Students who do not
detect cognitive failure during reading are likely to do poorly on initial tests and
progressively worse as the text and tests incorporate earlier concepts. Clearly, undetected
cognitive  failure  has  severe  consequences. 

Perception of cognitive readiness is a part of metacognition research, which is
concerned with people's awareness of their abilities to perceive, understand, and
remember. Metacognition research does not investigate the processes themselves, but the
learner's awareness of them (Otero & Campanario, 1990). This awareness can be thought
of  as  the  mental  activity  of  reflecting  upon  some  aspect  of  our  own  cognition.  Learner 
reflections  upon  comprehension  and  memory,  two  key  aspects  of  cognition,  are  called 
metacomprehension  and  metamemory.  However,  each  of  these  terms  commonly  takes  on 
varied  meanings  within  the  field,  making  it  difficult  to  determine  how  current  definitions 
contribute  to  theory  construction.  The  definitions  offered  in  this  study  are  not  intended  to 
represent  any  particular  theory  but  are  a  means  to  delimit  the  hypotheses  under 
consideration. 

Researchers have documented that undetected cognitive failure is a problem
common to younger, less experienced readers (Baker & Brown, 1984a, 1984b; Brown,
Armbruster, & Baker, 1986; Wagoner, 1983), but college students also often perform poorly
on examinations for which they feel adequately prepared. Although there are explanations
for this beyond undetected cognitive failure, experimental work supports the view that
college students often do not detect comprehension failure during reading relative to a given
task (Epstein, Glendberg, & Bradley, 1984; Glendberg & Epstein, 1987; Glendberg,
Wilkinson, & Epstein, 1982; Maki & Berry, 1984; Pressley, Snyder, Levin, Murray, &
Ghatala, 1987).

In  most  studies,  readers  are  either  asked  to  make  predictions  of  performance  before 
taking  an  examination  (prediction  calibration)  or  estimate  performance  at  various  points 
during the test itself (postdiction calibration). In the first instance, correlations between
predictions and actual performance have been near zero (Glendberg & Epstein, 1985).
Correlations between postdiction judgments and actual performance, however, have been
much larger (Metcalfe, 1986). Several researchers have suggested that the test provides
feedback that readers use to make more accurate judgments (Glendberg et al., 1985, 1987;
Pressley et al., 1987; Walczyk & Hall, 1989). The judgments before testing should be a
primary focus of research, for only they influence decisions to use different study strategies
and/or expend greater effort.

Many substantive issues must be addressed regarding adults' abilities to monitor
their reading, including determining the cognitive processes associated with accurate
perceptions of test readiness and specifying the sources of individual differences. Yet,
even if a thorough explanation for these monitoring problems existed, there are no
guarantees that they could be remedied by some form of direct instruction. With this in
mind, it may be prudent to explore instructional interventions that can assist readers during
the reading process, with the goal of bringing subjective judgments of test readiness into
closer correspondence with an objective assessment of test performance.

Statement  of  the  Problem 
The purpose of this study was to determine whether questions embedded in
expository text improve the correspondence between adult readers' subjective assessments
of  test  readiness  and  their  objective  test  performance  (prediction  calibration).  A  secondary 
consideration  was  whether  embedded  questions  have  an  impact  on  postdiction  judgments 
of  performance  (postdiction  calibration). 

Questions embedded in text have been found effective in the learning of prose
materials (Andre, 1987) and may exert a positive influence on cognitive monitoring by
altering the readers' perceptions of the reading task. Pressley et al. (1987) tested the impact of
embedded  questions  on  perceptions  of  cognitive  readiness  and  found  that  questions 
interspersed  in  the  text  had  the  effect  of  making  estimates  of  prediction  calibration 
statistically  more  accurate. 

Cognitive  science  research  findings  suggest  that  embedded  questions  improve 
subjective  assessments  of  test  readiness  in  a  number  of  ways.  For  example,  the  cognitive 
processes used to answer test questions may provide feedback to learners that they can then
use to make more accurate judgments regarding cognitive readiness (Glendberg et al.,
1987). Embedded questions may also serve as a prosthetic device, triggering readers to
evaluate their comprehension where they may not have done so on their own. If they
cannot answer the embedded questions, they can use this knowledge to readjust their
learning strategies. Embedded questions may also serve as a benchmark by informing
readers  of  the  semantic  level  to  which  they  should  direct  their  reading.  Although  isolating 
and  testing  these  effects  can  be  difficult,  each  shares  the  outcome  of  altering  readers' 
perceptions,  a  variable  more  easily  measured.  Researchers  have  manipulated  task 
characteristics  or  orientating  instructions  and  altered  readers'  ratings  of  perceived 
comprehension  (Pratt,  Luszcz,  MacKenzie-Keating,  &  Manning,  1982;  Shaughnessy, 
1981) as well as their reported sense of cognitive readiness (Brown, Bransford, Ferrara, &
Campione, 1982). The purpose of the present study was to discover the impact of
embedded questions on perceptions of cognitive readiness as well as on a number of other
salient perceptions. When this research question was tested previously (Pressley et al.,
1987), readers were given explicit instructions to answer the embedded questions. Most
reading settings, however, are learner controlled and subject to the nuances of reader
decision making. What has yet to be determined is the effect of embedded questions on
calibration of comprehension when readers are not explicitly cued, a condition adopted in
the  present  study  to  promote  the  generalizability  of  findings. 

Outline  of  the  Study 
The  two  manipulated  factors  in  this  study  were  embedded  questions  (yes/no)  and 
lookbacks  to  previously  examined  text  allowed  (yes/no),  and  they  were  crossed  to  produce 
four experimental conditions: embedded questions with lookbacks allowed, embedded
questions with lookbacks prohibited, no embedded questions with lookbacks allowed, and
no embedded questions with lookbacks prohibited. One hundred sixty-eight
undergraduate college students from a large public university in the South served as
the sample and were randomly assigned to one of the four experimental conditions. They were
asked to read and study approximately 1700 words of text from Xenograde Science
(Merrill, 1965) presented on a computer terminal. This text is a hierarchical learning set in
which  concepts  build  upon  each  other  describing  the  physics  of  a  make-believe  solar 
system.  Unfamiliar  material  was  used  to  minimize  the  confounding  effects  of  prior 
knowledge,  and  it  has  been  reported  to  produce  more  conscious,  analytical  processing  than 
material  more  familiar  to  subjects  (Hare  &  Smith,  1982).  After  studying  the  Xenograde 
text  and  before  taking  the  test,  readers  responded  to  a  series  of  questions  that  asked  them  to 
make comparative judgments about themselves as learners. Pilot testing indicated that the
material was of appropriate difficulty, so that effects were not artificially constrained
by floor or ceiling effects.

The  goal  of  this  study  was  to  examine  the  effects  of  embedded  questions  on 
readers' prediction and postdiction calibration. The correspondence between subjective
judgments made before the criterion test and subsequent test performance was used to
operationalize prediction calibration. The correspondence between subjective judgments
made at the time of testing and actual test performance was used to operationalize
postdiction  calibration. 

Research  Hypotheses 
Prediction  Measures  of  Calibration 

Readers' judgments of their understanding of text are said to be calibrated when
their subjective beliefs regarding performance are correlated with their objective
performance (Glendberg et al., 1987). The key is whether embedded questions can
improve this correspondence. Readers with better calibration should have a more
accurate sense of test readiness, and measures of calibration will have greater validity if
perceptions of test readiness are connected with perceived need for strategic behavior.
Readers who believe they do not understand the material should express the need to
restudy. Pressley et al. (1987) have created a prediction calibration measure of readers'
"perceived  readiness  for  examination  performance"  (p.  222),  designated  by  its  acronym 
PREP,  explained  in  the  methodology  chapter  of  this  study. 

The 2 x 2 design made it possible to test the effects of embedded questions in an
ecologically valid setting (lookbacks) and under conditions that allowed for experimental
comparison (no-lookbacks). Looking back to material previously read, but currently not
remembered, is a common reader practice and strategic response to comprehension failure.
By adding the no-lookback factor it was possible to separate the effects of embedded
questions  on  perceptions  of  cognitive  readiness  when  combined  with  rereading  decisions 
from  the  effects  of  embedded  questions  when  rereading  was  prohibited. 

Subjects  were  also  asked  to  predict  the  score  they  would  obtain  on  the  test  after 
studying the Xenograde text. This subjective assessment of cognitive readiness was
compared to subsequent test performance. Results from pilot testing suggested that
participants almost always overestimate their readiness. In this study, the discordance
between the predicted and subsequent objective scores has been operationalized as
prediction  inaccuracy,  or  PI,  and  is  calculated  by  taking  the  absolute  difference  between 
these  two  scores.  Results  from  the  PREP  instrument,  PI  measures,  and  the  confidence 
scores  readers  assigned  to  their  test  responses  served  to  determine  if  embedded  questions 
improve  calibration. 
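
The prediction inaccuracy (PI) score described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the scoring code used in the study: the function name and the example scores are hypothetical.

```python
def prediction_inaccuracy(predicted_score, actual_score):
    """Return PI: the absolute difference between the score a reader
    predicted and the score the reader actually obtained."""
    return abs(predicted_score - actual_score)


# Hypothetical reader: predicted 15 of 19 items correct, obtained 11.
pi = prediction_inaccuracy(15, 11)
print(pi)  # 4
```

Because PI is an absolute difference, an overestimate and an underestimate of the same size receive the same score; the direction of the error would have to be recovered from the signed difference.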
Postdiction  Judgments  of  Calibration 

For  each  item  on  the  19-item  criterion  test,  subjects  were  asked  to  predict  if  they 
answered the question correctly (yes/no). They were then asked to rate their confidence in
their yes/no choice on a 6-point Likert scale. Each confidence score was multiplied by +1
for a "hit" (predicted correct and answer was correct or predicted wrong and item was
wrong)  and  -1  for  each  "miss."  Differences  in  postdiction  accuracy  were  calculated  by 
dividing  the  sum  of  these  positive  and  negative  numbers  by  the  square  of  the  pooled 
variances for the 19 confidence scores. This measure is an adaptation of a technique
developed by Zimmerman, Broder, Shaughnessy, and Underwood (1977) and in the
present study was called postdiction judgments of calibration, or PJC. Part of the rationale
for including postdiction measures was to determine if there were any patterns of
correlations between individual prediction calibration performance and postdiction
calibration performance and to investigate whether embedded questions or the lookback
option had an impact on this relationship.
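
The signed-confidence step of the PJC measure can be sketched as follows. This is a rough illustration only: the function name is hypothetical, the 6-point scale is assumed to be coded 1 through 6, and the final normalization by the pooled variance term is deliberately left out because its exact form is not spelled out here.

```python
def signed_confidence_sum(predicted_correct, actually_correct, confidence):
    """Sum confidence ratings, counting +rating for a hit and -rating for a miss.

    A hit means the reader's yes/no judgment matched the outcome
    (predicted correct and was correct, or predicted wrong and was wrong).
    """
    if not (len(predicted_correct) == len(actually_correct) == len(confidence)):
        raise ValueError("all three sequences must have the same length")
    total = 0
    for predicted, actual, rating in zip(predicted_correct, actually_correct, confidence):
        # Assumption: ratings are coded 1 (least confident) to 6 (most confident).
        if not 1 <= rating <= 6:
            raise ValueError("confidence ratings are assumed to lie on a 1-6 scale")
        is_hit = predicted == actual
        total += rating if is_hit else -rating
    return total


# Hypothetical responses to four items (the criterion test had 19).
predicted = [True, True, False, False]
actual = [True, False, False, True]
ratings = [6, 2, 5, 3]
print(signed_confidence_sum(predicted, actual, ratings))  # 6
```

A reader who is confident when right and hesitant when wrong accumulates a large positive sum, whereas confident misses pull the sum down.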
Measures  of  Individual  Differences 

Pilot  studies  suggested  that  embedded  questions  alter  readers'  cognitive  processing, 
but  it  is  also  possible  that  these  questions  may  serve  readers  differentially  (Cronbach  & 
Snow, 1977). Therefore, it is important to know if intellectual abilities such as verbal-
educational ability (Cattell, 1963) and visualization (Corno & Snow, 1986) interact with
embedded questions in the prediction of both examination performance and individual
measures of calibration. Verbal and visualization measures are commonly treated as
representative of crystallized (Gc) and fluid (Gf) ability (Horn, 1989). The former requires
knowledge and often predicts school achievement, and the latter is a form of reasoning that
does  not  depend  heavily  on  knowledge  but  is  related  to  many  mental  operations,  of  which 
processing  complex  spatial  tasks  (such  as  the  Xenograde  System)  is  one.  In  pilot  testing, 
verbal  and  visual  measures  from  the  Kit  of  Factor-Referenced  Cognitive  Tests  (Ekstrom, 
French,  &  Harman,  1976)  correlated  with  test  performance  differentially,  depending  on 
group  membership.  These  measures  were  administered  in  the  present  study,  and  it  was 
anticipated  that  embedded  questions  would  best  serve  those  readers  low  in  verbal  and 
visualization  ability. 

Importance  of  the  Study 
Reading  is  an  important  component  in  human  learning.  Readers'  perceptions  of 
cognitive  readiness  determine  the  types  of  learning  strategies  they  employ  and  the  amount 
of mental energy they invest. Current evidence suggests that undetected cognitive failure
may be a problem even among adult skilled readers. Difficulties associated with
comprehension and memory monitoring have been investigated from different research
perspectives. Regardless of the approach, each has dealt with the problem that reading for
remembering  is  a  learner-controlled  process  that  does  not  "yield  readily  to  observation, 
measurement,  and  control"  (Thomas  &  Rohwer,  1986,  p.  19).  The  challenge  has  been  to 
devise  powerful  measurement  techniques  that  can  externalize  these  complex  mental  events 
while  simultaneously  ensuring  that  the  measurements  do  not  alter  the  subjects'  reading 
process (Brown, 1980). Although this study offers a practical solution to the pervasive
problem of undetected cognitive failure during reading, its principal contribution is as a
prototype  for  the  study  of  cognitive  monitoring.  Embedded  questions  are  a  practical 
intervention  that  is  both  inexpensive  and  subject  to  manipulation,  and  their  impact  on 
readers  is  experimentally  observable.  Research  evidence  has  demonstrated  that  different 
embedded  question  formats  invoke  different  cognitive  processes  (Hamaker,  1986).  If  it  is 
determined  that  embedded  questions  can  affect  the  subjective  assessments  of  test  readiness, 
then  the  characteristics  of  the  questions  themselves  can  be  manipulated  (sequencing,  types, 
amounts) and their differential effects on measures of cognitive monitoring determined.
Relationships so uncovered would be used to inform theory construction and testing. The
proper intervention would remedy practical problems as well as serve as an effective
experimental  tool. 

Additionally, cognitive monitoring research has been largely concerned with
younger reader-older reader and good reader-poor reader comparisons. These
between-group differences in cognitive monitoring ability have been explained in terms of the
qualitative differences in learners' prior knowledge and reading ability. The purpose of the
present study was to investigate individual differences in cognitive monitoring within
groups similar to each other.


CHAPTER  2 
REVIEW  OF  THE  LITERATURE 


Many  of  the  processes  associated  with  cognitive  monitoring  remain  elusive  despite 
intensive investigation. The first part of this review will explore literature relevant to
cognitive monitoring during reading. One of the central difficulties in conducting cognitive
monitoring research has been determining how to externalize these complex mental events,
and  the  second  part  will  review  research  tactics  used  to  circumvent  this  difficulty.  The  use 
of  embedded  questions  in  conjunction  with  cognitive  monitoring  research  is  a  relatively 
new  idea,  and  a  synthesis  of  this  literature  will  complete  the  review. 

Cognitive  Monitoring  During  Reading 
The study of cognitive monitoring during reading, and the observation that some
people are not good at it, are not new. Brown and Baker (1984) reminded us that "reading
researchers since the turn of the century (e.g., Dewey, 1910; Huey, 1908/1968;
Thorndike, 1917) have been aware that reading involves planning, checking, and
evaluating, activities now regarded as metacognitive skills" (p. 354). As have almost all
branches of educational inquiry, that of cognitive monitoring has been governed by the
study  of  children.  One  robust  and  pervasive  finding  has  emerged  from  this  research: 
children  often  fail  to  detect  comprehension  problems  during  reading  (see  Baker  &  Brown, 
1984a,  1984b;  Brown  et  al.,  1986;  Wagoner,  1983,  for  reviews).  Research  with  adult 
subjects,  however,  has  been  sparse,  perhaps  due  to  the  tacit  assumption  that  adults 
adequately monitor their cognition during reading. Recent inquiry using adult subjects has
called  this  premise  into  question. 
Non-optimal Cognitive Monitoring in Adults

Early  researchers  examined  subjects'  abilities  to  make  judgments  about  the  contents 
of their memory for tasks other than reading. Fischhoff, Slovic, and Lichtenstein (1977)
asked  college-aged  subjects  a  number  of  general  knowledge  questions,  had  them  assign  a 
probability  estimate  to  their  answers,  and  discovered  that  subjects  were  poor  estimators  of 
their  knowledge.  Koriat,  Lichtenstein,  and  Fischhoff  (1980)  operationalized  the  term 
calibration  and  summarized  findings  that  used  a  variety  of  question  and  response  formats. 

An individual is well calibrated if, over the long run, for all answers
assigned a given probability, the proportion correct equals the probability
assigned. The general conclusion from these studies is that people are
rather poorly calibrated. Although higher probabilities are typically
associated with larger hit rates, in an absolute sense the probabilities and hit
rates diverge considerably. The major systematic deviation from perfect
calibration is overconfidence, an unwarranted belief in the correctness of
one's answer. (p. 108)
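
The definition of calibration quoted above lends itself to a short worked example in Python. The sketch below is only illustrative: the function names and the sample responses are hypothetical, and overconfidence is summarized here, as one simple assumption, by mean assigned probability minus overall hit rate.

```python
from collections import defaultdict


def calibration_table(probabilities, correct):
    """Map each assigned probability to the proportion of answers
    given that probability that were actually correct."""
    by_probability = defaultdict(list)
    for p, was_correct in zip(probabilities, correct):
        by_probability[p].append(1 if was_correct else 0)
    return {p: sum(hits) / len(hits) for p, hits in sorted(by_probability.items())}


def overconfidence(probabilities, correct):
    """Mean assigned probability minus overall hit rate (positive = overconfident)."""
    mean_probability = sum(probabilities) / len(probabilities)
    hit_rate = sum(1 for c in correct if c) / len(correct)
    return mean_probability - hit_rate


# Hypothetical respondent: six answers with self-assigned probabilities.
probs = [0.6, 0.6, 0.9, 0.9, 0.9, 0.9]
right = [True, False, True, True, False, False]
print(calibration_table(probs, right))  # {0.6: 0.5, 0.9: 0.5}
print(round(overconfidence(probs, right), 2))  # 0.3
```

A perfectly calibrated respondent would show a proportion correct of 0.6 for answers rated 0.6 and 0.9 for answers rated 0.9; here both groups fall short, which is the overconfidence pattern the quotation describes.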

Similar results have been found when expository text was the experimental task.
Shaughnessy (1981) had students in an introductory psychology course rate the probability
that they had chosen the correct alternative on a multiple-choice test based on text material.
He found that all students were better in their prediction than they would have been by
chance alone, but many of the poorer readers had very high confidence in their wrong
answers. In a similar experiment, Waern and Askwall (1981) found that correlations
between confidence and correct answers were nonsignificant. Pressley and Ghatala (1988)
had college-aged subjects read short passages from the Scholastic Aptitude Test (College
Entrance Examination Board, 1986) and answer 15 reading comprehension items. Readers
in this experiment indicated on an 8-item confidence scale how certain they were that their
answers were correct in the presence of the answer they chose, the question they responded
to, and the passage they were asked to read. In this experiment, readers were able to
discriminate between correct and incorrect items at a statistically significant level (p < .05).
These studies correspond to what has been defined as postdiction calibration (estimations
of performance in the presence of the test). More central to the interest of this study,
however, are the subjective beliefs and judgments regarding cognitive readiness that occur
before encountering the test, that is, prediction calibration. These are the beliefs that guide
strategic decision-making during the process of reading.

One technique for determining if readers monitor their cognition during reading has
been to embed errors or contradictions within the text (Markman, 1977, 1979; Winograd &
Johnston, 1982). The technique is based on the assumption that readers construct meaning
during reading; therefore, detecting errors would be evidence of comprehension
monitoring. Glendberg et al. (1982) were the first to use this technique with adult readers.
In three experiments, they used a 1,600-word college-level text and embedded sentences
that were in contradiction with an inference derivable from previous sentences. They
reported that 95% of the subjects failed to detect the contradictions while concomitantly
indicating they were fairly certain or very certain they had understood the passage. The
investigators called this joint occurrence an illusion of knowing, which occurs when readers indicate their
comprehension  is  high  but  an  objective  measure  indicates  comprehension  failure.  In 
another  experiment,  they  had  subjects  in  an  introductory  psychology  course  read  three 
different  expository  texts.  This  time  the  contradictory  material  was  explicitly  presented  in 
two sentences that were adjacent to each other and at the end of the text. These
college-aged subjects failed to detect 51% of the contradictions even when they were explicitly told
that the text could contain one or more contradictions and that they should search for them.

Maki and Berry (1984) used college students as subjects but did not adulterate the
text with contradictory material. Participants read four segments from an introductory
psychology text (about 12 pages each). These segments were further divided by headings
and subheadings that corresponded to distinct ideas within the reading. Subjects were
instructed  to  read  at  their  own  pace,  to  read  each  section  carefully  but  to  read  it  only  once, 
and  to  read  in  preparation  for  a  multiple-choice  test  to  be  administered  the  following  day.  At 
the  end  of  each  reading  session,  subjects  indicated  on  a  6-point  rating  scale  how  confident 
they were that they could answer a question on the material (1 = surely answer wrong to
6 = surely answer correct). Maki and Berry examined a number of different variables
(ability, serial position, item difficulty, and test delay), and even in those experimental
conditions that produced the most accurate predictions the mean significant positive
correlations between test performance and the ratings were below .23.

Glendberg and Epstein (1985) re-examined the phenomenon referred to as the
illusion of knowing. They abandoned the error detection paradigm for methodological
reasons that will be discussed later and used 15 intact one-paragraph expositions dealing
with a variety of unrelated topics. After reading each topic, subjects reported the
confidence they had in being able to use what they had learned to draw correct inferences
from the text. The criterion measure was an inference that was related to the central
proposition of the text. Two versions of each inference were prepared: one correct and the
other incorrect. In 44% of the cases higher confidence scores were associated with lower
performance on the objective test. Concerned that perhaps poor calibration was due to
inappropriate expectations concerning the type of reading and test the reader would
encounter, Glendberg et al. (1985) added a control that familiarized the reader with both
elements. In this second experiment, correlations between subjective assessments and
objective performance were once again non-significant.

These studies represent the bulk of the research that has had as its primary purpose
the comparison of adult readers' subjective assessments of comprehension with an
objective performance measure. Their findings offer strong evidence of non-optimal
cognitive monitoring in reading by adults, something that Baker and Brown (1984a)
predicted, because college students receive no formal instruction in evaluating and
regulating their cognition. Much more cognitive monitoring research has been conducted,
but this work has not dealt directly with issues of prediction calibration in adult readers
(King, Zechmeister, & Shaughnessy, 1980; Koriat et al., 1980; Lichtenstein et al., 1982;
Nelson, Leonesio, Shimamura, Landwehr, & Narens, 1982; Pratt et al., 1982; Schacter,
1983; Shaughnessy, 1981).
There are, however, methodological and design problems that must be addressed.
Also, Baker and Anderson (1982) raised serious doubts about the reliability of the
contradiction paradigm, and there is some question that overconfidence in comprehension
may be due to scaling issues. For example, Maki and Berry's (1984) mean ratings were
rarely below 4 on a 6-point scale. Weaver (1990) used a simulation technique and
demonstrated  that  calibration  scores  in  the  study  by  Glendberg  and  Epstein  (1985)  were 
artificially  constrained  by  using  only  one  inference  item  per  text.  In  total,  much  work 
remains  before  we  have  a  thorough  understanding  of  how  perceptions  of  cognitive 
readiness  emerge  during  reading. 

Cognitive Monitoring and Perceptions of Cognitive Readiness

Common sense suggests that most adults monitor their cognition during reading.
Under normal circumstances reading is a routine process, proceeding smoothly and with
little effort unless a difficult term or concept is encountered and recognized as such, at
which point most readers slow their reading and reallocate their attention accordingly
(Kintsch & van Dijk, 1978), a clear example of comprehension monitoring. In addition,
all readers have experienced the realization that material previously read cannot be
remembered.  When  readers  engage  in  cognitive  monitoring  to  alleviate  either  condition, 
they  are  demonstrating  metacomprehension  and  metamemory. 

Although adults can monitor their cognition, researchers suggest that they often do
not detect comprehension failure during reading, a clear lack of support for the tacit
assumption that college-aged subjects are proficient at monitoring their cognition. An
information processing paradigm of learning and contemporary reading theories might be
useful as a framework for discussing this anomaly. A word of warning is in order: The
processes associated with cognitive monitoring are not completely understood. It logically
follows that the language theorists have used to describe these processes is equally, if not
more, vague. Furthermore, some of the language used to describe these processes may be
setting up what Sternberg (1985) referred to as a "trap," which will be discussed later in the
chapter.

In its most basic rendering, the information processing view of human cognition
defines reading and learning essentially as the processes of problem solving and memory
access (Shuell, 1986). For purposes of this study, three elements of the information
processing system are of interest: the environment, working memory, and long-term
memory (prior knowledge). Stimuli from the environment (in our case text material)
provide input to the system and establish the problem set (Greeno, 1977), which can be
defined as the efforts a reader makes to understand the material. The reader engages in a
search of long-term memory for information relevant to the problem set. Cognitive codes
from prior knowledge are combined with environmental representations in working
memory. This information is transformed in ways that determine how and what will later
be recalled (Gitomer & Glaser, 1987; Harris, 1981).

Anderson, Reynolds, Schallert, and Goetz (1977) concluded that every act of
comprehension involves prior knowledge, and what a person learns from reading is highly
dependent on it. Readers attempt to construct meaning by combining elements of the text
with background knowledge stored in long-term memory (Anderson & Pearson, 1984).
Anderson (1976) described this prior knowledge as an associative network of
interconnected concepts commonly referred to as schema. Learning during reading occurs
when the reader constructs meaning from the text and updates his prior knowledge, or
schema. This iterative interaction between text and prior knowledge is described by Morris
(1990).

The  mental  models  constructed  by  comprehenders  during  discourse 
processing  must  continually  be  updated,  revised,  and  often  enough, 
completely  abandoned  in  favor  of  models  more  consistent  with  the  available 
information.  Thus,  knowledge  acquisition  requires  continual  evaluation 
[emphasis  added]  of  the  evolving  knowledge  state  to  inform  the  local  and 
global organization of discourse processing and other study behavior.
(p. 223)


Although  Morris  described  how  we  might  learn  from  reading  (updating  our  mental 
models),  we  have  yet  to  specify  how  we  might  evaluate  our  understanding  during  reading. 
Evaluation, or cognitive monitoring, must certainly be a prerequisite to a decision to
"revise" our existing mental models. Kintsch (1979) proposed that at the molecular level
readers test (evaluate, cognitively monitor) individual words against prior knowledge to
determine if they make sense. Reading consists of combining words into sentences and
sentences into larger structures. According to Kintsch (1979) and others (Irwin, 1986;
Johnson & Barrett, 1981; Rumelhart, 1977), each progressively larger idea unit must be
tested against prior knowledge in the attempt to construct meaning from reading.
Glenberg and Epstein (1985) described how this hypothesis-testing process might
proceed.

Efficient  monitoring  might  consist  of  a  succession  of  self-activated 
inferences  or  anticipations  which  are  checked  against  incoming  text. 
Repeated  verifications  of  these  inferences  signals  the  development  of  a 
knowledge  structure  which  is  coherent  and  which  is  isomorphic  with  the 
knowledge  structure  of  the  text.  On  the  other  hand,  if  some  number  of 
these inferences are disconfirmed this will signal [emphasis added] an
incomplete, inaccurate, or inappropriate knowledge structure. (p. 710)

According  to  this  view,  in  order  to  learn  from  what  we  read  we  must  be  able  to 
actively  monitor  our  cognition  (to  revise  our  mental  models),  and  effective  cognitive 
monitoring  depends  on  the  integrity  of  the  prior  knowledge  against  which  we  test  our 
inferences  (Dreher  &  Singer,  1981).  Since  the  1970s  a  number  of  literature  reviews  have 
converged  on  one  very  robust  finding:  Older  and  more  able  readers  monitor  their  cognition 
during reading better than younger and less able readers (Baker & Brown, 1984a; Flavell &
Wellman, 1977; Garner, 1987; Wagoner, 1983). This finding is consistent with the
theoretical discussion above: older and more able readers have better formed knowledge
structures against which to test their inferences. This condition leads to stronger signals.

One might appropriately ask what the nature of this signal is. Flavell (1981)
suggested that metacognition can be differentiated into metacognitive knowledge and
metacognitive experiences. The signal referred to by Glenberg and Epstein (1985) is what
Flavell (1981) has called a triggering event, a cognitive process he placed under the rubric
of a metacognitive experience. According to Flavell (1981), this triggering event is
preceded by a sensitivity. Although he does not specify the mechanics of this sensitivity,
he cites "tip-of-the-tongue experiences" (when someone is unable to recall information but
knows that they know it) as an example (Flavell & Wellman, 1977). Garner (1987)
described these metacognitive experiences as awarenesses, realizations, or "ahas" and
added that metacognitive experiences often occur when cognitions fail.

Although concepts such as signal and triggering event are ambiguous, they help at
an intuitive level to understand the conditions that accompany the realization that one does
not understand. They are also the experiences that help readers adjust their perceptions of
cognitive readiness. The critical issue is undetected cognitive failure, the lack of realization
that one does not understand. Baker and Brown (1984a) pointed out that readers who fail
to detect cognitive failure will have the same feelings as those who detect it; however, those
who do not detect errors in understanding are unlikely to take any strategic action to
remediate understanding.

What  then  is  the  role  of  knowledge  in  readers'  perceptions  of  cognitive  readiness? 
Baker and Brown (1984a) suggested that if readers are able to confirm or disconfirm a
hypothesis, they can acquire knowledge about how well they are comprehending a text.
For example, readers encountering the word "effluvia" might test this word against prior
knowledge and have the realization that they do not understand it. They might be
absolutely certain they do not know what the word means and therefore be able to generate
a strong signal. Perceptions of cognitive readiness would emerge naturally from such a
condition. Expository text is, however, more than a conglomeration of words. According
to Kintsch (1979), text is composed of a hierarchy of structures that readers must build and
test against prior knowledge in an attempt to construct "macrostructures" of meaning.
Herein lies the quandary: Can we as readers always confirm or disconfirm that we
understand  what  we  are  reading? 

Nickerson  (1985)  reminded  us  that  understanding  is  qualitative  in  nature.  One  does 
not  simply  understand  fully  or  fail  to  understand  altogether.  The  criteria  used  to  measure 
understanding of concepts or phenomena are variable and vague, so it is not
surprising that people have a difficult time judging their own reading comprehension
(Markman, 1981). For example, Markman (1979) found that young readers often judge
their comprehension of prose to be adequate when individual sentences make sense even
though the sentences within the text contradict each other. Glenberg et al. (1982) reported
similar  findings  with  adult  subjects. 

Researchers  trying  to  understand  the  processes  involved  in  cognitive  monitoring 
and  perceptions  of  cognitive  readiness  may  feel  caught  in  what  appears  to  be  a  theoretical 
circle. During reading, readers attempt to construct meaning for ever larger
text structures. They do this by testing incoming information against prior knowledge to
determine if this information makes sense. If readers are able to confirm or disconfirm
their  hypotheses  about  what  they  are  reading,  they  can  then  generate  signals  about  their 
comprehension  and  cognitive  readiness.  However,  the  nature  of  these  signals  has  not  been 
adequately  specified  and  remains  highly  ambiguous.  Although  signals  emerge  out  of 
detected  cognitive  failure,  this  knowledge  is  of  little  help  since  undetected  cognitive  failure 
is  the  problem,  and  readers  who  understand  incorrectly  have  much  the  same  feelings  as 
readers  who  understand  correctly.  In  addition,  readers'  criteria  for  testing  their 
understanding are variable and open to error.

Although these theoretical formulations may be of limited help, the challenge is to
externalize these complex mental events so they can be studied. The question is how to go
about this. When age and reading ability differences are large, perceptions of cognitive
readiness can be understood in terms of qualitative differences in knowledge structures.
These are interesting and informative between-group findings. What remains to be
specified is how to account for, and conduct inquiry into, individual differences in
perceptions  of  cognitive  readiness  that  emerge  within  groups  similar  to  each  other. 

With  research  focused  on  metacognitive  knowledge  in  groups  similar  to  each  other, 
it  is  possible  that  individual  differences  in  perceptions  of  cognitive  readiness  will  emerge 
from an infinite regression of finer gradations of prior knowledge. The problem would be
to determine what prior knowledge to measure and how to measure it. Given the ambiguity
of our previous discussion of metacognitive experiences, focusing on readers' knowledge
may seem the preferable alternative, and herein lies the trap that Sternberg (1985) referred to.

Distinguishing Between Knowledge of Knowledge and Executive Processes

According  to  Sternberg  (1985)  and  others  (Anderson,  1976;  Bobrow,  1975; 
Cavanaugh  &  Perlmutter,  1982),  metacognition  research  has  constricted  and  restricted  its 
domain of inquiry by defining metacognition as knowledge about knowledge. By
overrelying on this construct, researchers have excluded from consideration cognitive processes
(executive  processes)  that  might  be  useful  for  understanding  cognitive  monitoring  and 
perceptions  of  cognitive  readiness.  This  study  proposes  an  assessment  technique  that  will 
measure  these  processes  at  the  executive  level. 

The literature is replete with different definitions of metacognition. Most contain
elements  of  both  awareness  and  control.  For  example,  Baker  and  Anderson  (1982)  defined 
it  as  "one's  knowledge  and  control  of  one's  own  cognitive  processes"  (p.  282). 
Castleberry  (1984)  claimed  that  "research  concentrating  on  determining  the  reader's 
awareness  of  and  control  [emphasis  added]  over  the  factors  involved  in  the  reading  process 
is  called  metacomprehension  research"  (p.  205).  Lawson  (1984)  argued  that  confounding 
these  two  aspects  of  cognition  has  led  to  the  conflicting  patterns  of  findings  in 
metacognition  research.  Yussen  (1985)  and  others  (Anderson,  1976;  Bobrow,  1975; 
Cavanaugh & Perlmutter, 1982; Gagne, 1977) argued that awareness and control should be
seen as logically distinct.

Awareness, when used in the context of self-referent processes, refers to
knowledge  about  knowledge.  To  qualify  as  awareness,  these  processes  must  logically  be 
the  by-product  of  deliberate  and  conscious  reflection  and,  hence,  should  be  verbalizable 
and  available  for  conscious  reporting.  Metacognitive  knowledge  is  a  consequence  of 
having  reflected  upon  our  cognition.  The  act  of  reflection  as  a  cognitive  function  is 
controlled  by  the  executive  processes.  According  to  the  information  processing 
perspective, executive processes marshal, regulate, orchestrate, or otherwise control other
cognitive  processes  (Neisser,  1967;  Simon,  1979).  Brown  (1981)  and  Brown  and  Baker 
(1984)  labeled  verbalizable  knowledge  as  static  knowledge  and  executive  processes  as 
strategic  knowledge.  In  Brown's  conceptualization,  the  role  of  strategic  knowledge  is  to 
plan,  analyze,  modify,  and  evaluate  other  cognitive  processes. 

Metacognitive knowledge and executive processes differ from each other in two
important ways. First, the cognitive processes associated with knowledge about
knowledge are accessible to conscious awareness and hence reportable. Executive
processes,  on  the  other  hand,  may  or  may  not  be  reportable.  In  the  process  of  developing 
expertise,  individuals  find  that  many  cognitive  activities  formerly  engaged  in  at  a  conscious 
level,  and  hence  the  product  of  conscious  awareness  and  verbalizable,  slowly  become 
automatic,  or  proceduralized,  and  therefore  unreportable  (Shiffrin  &  Dumais,  1981).  For 
example,  children  often  learn  to  tie  their  shoes  by  learning  a  verbal  description  of  what  is 
required (the rabbit jumps over the log). With practice, shoe tying becomes a completely
automatic process, and the verbal knowledge is no longer needed or used. According to
Anderson (1982), this process is called knowledge compilation and is composed of two
subprocesses, proceduralization and composition. During proceduralization the declarative
cues that once guided action sequences are dropped from use. Composition is the process
of collapsing several procedures into a smaller number of action sequences. For this reason,
experts may find it difficult to verbalize their cognitive processes; separate skills that were
once dependent on declarative cues can now be applied as one compiled procedure, so experts
can apply the correct algorithm with very little conscious effort (Simon & Simon, 1978).
Alternatively,  when  novices  are  in  the  process  of  learning  the  separate  components  of  a 
skill, they may still be dependent upon declarative cues and hence be readily able to report
their introspections about the processes involved. Although novices would be better able to
report metacognitive knowledge, they would remain inferior to the expert in being able to
regulate cognition. If this proposition is accurate, it would put into serious question the
tactic of having readers report on their introspections as a means to study cognitive
monitoring. Given this distinction, there is every reason to believe that perceptions of
cognitive  readiness  exist  below  the  level  of  verbalizable,  conscious  awareness.  This  point 
is  particularly  critical  when  considering  normal  adult  reading  patterns.  For  many  adults, 
the  skills  associated  with  reading  have  been  so  well  practiced  that  reading  has  become  a 
domain-independent  and  highly  automatic  skill.  For  this  reason,  assessment  techniques 
must  be  found  to  externalize  the  outcomes  of  executive  processes  of  pre-  and  postdiction 
calibration. 

Executive  processes  also  differ  from  metacognitive  knowledge  in  terms  of  their 
place  in  the  hierarchy  of  human  cognition.  They  regulate  all  aspects  of  cognitive 
processing,  of  which  memory  monitoring  is  just  one.  Lawson  (1984)  argued  that  because 
of their high-level nature, executive processes are expected to have a more general influence
across most cognitive domains. Metacognitive knowledge, on the other hand, is expected
to be domain, and even task, specific. If executive skills are more uniformly transferable
across cognitive activities, then assessment techniques that measure these processes may be
more generalizable. According to Sternberg (1985), metacognitive research that relies
almost exclusively on readers' conscious access to their metacognitive processes
(knowledge about knowledge) as a source of data is ill-conceived, and his concerns are
probably well-grounded. The nature of our alternatives is the next subject for discussion.

Externalizing Complex Mental Events

Metacognitive research is at a point where investigators are debating the
appropriateness  of  both  research  tactics  and  the  nature  of  the  questions  asked.  Because 
there  is  no  direct  way  of  determining  what  is  going  on  inside  readers'  heads,  creativity  is 
called for in externalizing these complex mental events (Brown, 1980; Garner, Wagoner, &
Smith,  1983).  Because  no  perfect  measure  of  these  cognitive  processes  is  likely  to  be 
discovered,  what  is  needed  are  measures  in  the  spirit  of  converging  operations  (Campbell 
&  Fiske,  1959). 

Embedded questions are part of the repertoire of tactics currently being used to
externalize complex mental events, though to this point they have been utilized only by
Pressley et al. (1987) as a technique to study metacognitive decision-making. The rationale
for formulating the dependent variable as perceptions regarding cognitive readiness stems
from the need to measure reader decision-making. By studying metacognitive
decision-making, the effects of both metacognitive knowledge and executive processes will be
measured in the current study. This approach will be contrasted with techniques that rely
on readers' conscious access to metacognitive knowledge.

Measuring Metacognitive Knowledge

Although  Flavell  (1981)  was  vague  in  his  description  of  metacognitive  experiences, 
he  was  quite  definitive  in  formulating  a  taxonomy  of  metacognitive  knowledge  and 
distinguished three categories: person, task, and strategy. Person refers to performance
knowledge individuals have about their limitations and capacities to monitor experiences in
a memory task. The person's awareness of how task demands can influence memory
performance and how certain strategic choices can affect storage and retrieval refers to task
and strategy components, respectively (Schneider, 1989). Many investigators have tried to
understand the cognitive processes associated with cognitive monitoring by asking readers
to report on the declarative knowledge they have about themselves, the nature of the task,
and the strategies they are employing in relation to some reading material. This is a
knowledge  of  knowledge  approach  to  metacognition,  and  according  to  Baker  and  Brown 
(1984a)  the  goal  is  to  measure  the  stable  and  statable  information  readers  have  concerning 
the  cognitive  processes  involved  in  an  academic  task.  They  argue  that  the  information  is 
stable  because  readers  who  know,  for  example,  that  organized  material  is  easier  to  learn 
than  disorganized  material  will  continue  to  know  these  facts  if  interrogated  properly. 
Cavanaugh and Perlmutter (1982) argued that although Flavell never suggested that his
taxonomy  was  exhaustive,  some  writers  treat  it  as  if  it  were.  This  consensus  regarding 
what  knowledge  to  measure  has  not  been  accompanied  by  any  agreement  on  how  to 
measure  or  interrogate  readers. 

One  creative  alternative  to  interrogating  readers  directly  has  been  to  use  the  error 
detection paradigm, a tactic used with both children (Markman, 1977; Otero &
Campanario, 1990; Winograd & Johnston, 1982) and adults (Glenberg, Wilkinson, &
Epstein, 1982; Markman, 1979) that consists of embedding errors (informational gaps or
inconsistencies) in connected prose and informing subjects that the goal of the task is to
make sense of the material. Readers' verbalized attention to these errors is taken as
evidence of comprehension monitoring (Garner & Anderson, 1982). Although the utility
of this tactic has not been rejected out-of-hand, there are serious concerns. Glenberg and
Epstein (1985) warned that orienting instructions to detect contradictions may affect
subjects' reading strategies in ways that limit the generalizability of the results. In addition,
researchers using this technique have been unable to eliminate rival hypotheses for why
someone would fail to detect an error that have nothing to do with poor cognitive
monitoring. For example, authority-questioning reticence and inferential "fix-ups" of
comprehension errors may be at work (Baker & Anderson, 1982; Garner & Anderson,
1982; Grice, 1975; Winograd & Johnston, 1982). By careful preparation at the design
stage, Garner and Anderson (1982) were able to overcome many of these obstacles. The
rival hypothesis for which they could not control was that "verbal reports (including the
recall data) may or may not accurately reflect actual states of knowledge or systems of
regulation"  (p.  74). 

The  entire  knowledge  about  knowledge  approach  rests  on  the  assumption  that 
memory is amenable to inspection and analysis by the memorizer (Cavanaugh & Perlmutter,
1982). In other words, the approach assumes an isomorphic relationship between claim and
reality: that learners can give accurate accounts of their memory processes. This controversy is as old
as  the  psychology  of  memory  (Ebbinghaus,  1885),  and  modern  commentators  have  argued 
that  there  is  almost  no  conscious  awareness  of  perceptual  and  memorial  tasks.  They  have 
argued  that  it  is  the  result  of  thinking,  not  the  processes  of  thinking,  that  appears 
spontaneously  in  consciousness  (Nisbett  &  Wilson,  1977). 

Although some investigators (Cavanaugh & Perlmutter, 1982; Meichenbaum,
Burland, Gruson, & Cameron, 1985) have questioned whether any amount of creative
interrogation will result in accurate accounts of memory processes, others have persisted in
their attempts to interrogate readers directly on the strategies they employed or thoughts
they had during reading. This approach is based on the hope that memory knowledge
might help explain memory performance. Both open-ended and checklist formats of
interviewing have been used. The problem with this tactic is highlighted by Garner et al.
(1983). 

Unfortunately,  we  know  that  subjects  report  using  behaviors  they  do  not 
demonstrate using (Brown & Lawton, 1977; Garner & Reis, 1981); report
using strategies they apparently think they should be using (Baker, 1982);
and fail to report obstacles or resolution of obstacles apparently deemed too
obvious to mention. (p. 44)

Meichenbaum et al. (1985) added that when interviews are used as a post-performance
assessment mechanism there is the added interpretation concern of whether
subjects are reporting on thoughts they had during the reading process or their report is a
post-performance rationalization. Think-aloud techniques and interviews conducted
concurrently with the reading task have been used in several different formats. These
methods provide rich, task-specific introspections and avoid concerns that the reports are
post-performance rationalization. If investigators use an open-ended interview or the
reader creates written protocols, there is the added concern that what is being measured may
be a by-product of the reader's "verbosity." Intra-task measures are highly recommended
(Baker & Brown, 1984b) because they have the advantage of documentational force.
However, Garner and Anderson (1982) reminded us that intra-task measures can disrupt
and even alter the nature of the cognitive processing.

These are just a few of the methodological issues associated with using what
readers can declare or investigators can interrogate as data sources. Overall, the
metacognitive knowledge-performance relationships have been disappointing
(Forrest-Pressley, MacKinnon, & Waller, 1985; Yussen, 1985). Cavanaugh and Borkowski
(1980) found that a positive relationship between memory knowledge and memory
performance disappeared when within-grade correlations were considered. Cavanaugh and
Perlmutter (1982) reviewed approximately a dozen studies and found the same
developmental trend mentioned earlier: older and more able readers are more informed
about  the  stable  characteristics  of  cognition  (self,  task,  strategies)  than  are  younger  readers. 
However,  the  magnitudes  of  these  correlations  were  low  or  moderate. 

A  declarative-knowledge  memory-performance  connection  may  well  exist.  It  is 
also possible that there is an executive-process memory-performance connection.
However, this relationship will not be discovered with research tactics that depend on
readers' declarative knowledge as the only source of data, for many of the cognitive
processes and perceptions of cognitive readiness exist below the level of conscious
awareness.

Measuring Executive Processes Through Decision-Making

A  number  of  findings  support  the  contention  that  perceptions  of  cognitive  readiness 
occur at the executive level. Baker and Anderson (1982) reported that college students
spent more time reading material that conflicted with material previously read. One
interesting highlight was that although they adjusted their reading speeds, they could not report
that anything was wrong. Flavell, Speer, Green, and August (1981) noticed that young
children in a listening task did not always identify a flawed message but frequently showed
signs of processing difficulty when listening to the message, such as body language and
expressions indicative of puzzlement. These findings are consistent with Sternberg's (1985)
observation  that  comprehension  monitoring  might  exist  below  the  level  of  conscious 
awareness.  If  there  is  a  connection  between  executive  processes  and  perceptions  of 
cognitive  readiness,  the  decision  becomes  how  to  externalize  these  events  so  they  can  be 
measured. 

Yussen (1985) and Schneider (1989) noticed that in Wellman's (1983) review of
the literature there was a group of studies in which subjects were asked to make decisions
regarding the state of their cognitive readiness (knowing when one is ready to recall). The
correlational link was stronger than in the Cavanaugh and Perlmutter (1982) review, which
focused on hypothetical organizational strategies (declarative knowledge). Yussen
acknowledged that the appropriate research had not yet accumulated and speculated that
it  may  be  easier  to  find  evidence  for  reliable  metacognitive  phenomena  if  measures  are 
closely  linked  to  metacognitive  decisions  and  processes.  Yussen  also  suggested  that  the 
memory-knowledge/memory  performance  connection  had  failed  because 

The  things  that  researchers  ask  subjects  to  report  on  are  not  sufficiently 
omnipresent  ecologically  and/or  are  not  consistently  used  in  everyday 
endeavors.  By  contrast,  the  state  of  one's  cognitive  readiness  [emphasis 
added]  is  a  highly  salient  and  general  matter  to  be  reckoned  with  by 
everybody,  in  a  frequent,  routine  way.  (p.  262) 

Perceptions of cognitive readiness are the dependent variables in this project. The
function of the executive process is to make decisions regarding the allocation and
implementation of individuals' scarce cognitive resources. Hence, by asking readers to
make decisions regarding their cognitive readiness, this study, by logical necessity,
measures the outcome of their executive processes.

Asking readers to make decisions as a tactic to explore executive processes and
perceptions  of  cognitive  readiness  is  new  to  the  study  of  cognitive  monitoring.  Research 
findings  support  the  view  that  even  adult  skilled  readers  may  not  monitor  their  readiness  in 
optimal  fashion.  As  such,  this  study  seeks  to  discover  if  embedded  questions  can  bring 
readers'  subjective  assessments  of  cognitive  readiness  in  closer  correspondence  with  their 
objective  performance  (caUbration).   The  rationale  for  utilizing  embedded  questions  in  this 
manner  and  literature  relevant  to  this  procedure  will  be  outlined  in  the  next  section. 

The Use of Questions in Cognitive Monitoring Research
Questioning associated with academic reading has been the subject of concerted
scholarly interest since the early seminal work of Rothkopf (1966). Questions inserted into
prose material with the intent of affecting some learning outcome have been collectively
referred to as embedded or adjunct questions. Embedded questions belong to a family of
text-based instructional interventions, such as advance organizers, boldface print, and
chapter summaries, that are designed as environmental manipulations of the changeable
components of reading processes. Their intent is to alter a reader's normal reading and text
processing behavior to bring about more meaningful processing. Many research findings
support the impact of adjunct questions on retention and reading comprehension
when the questions in the text are of the same or similar semantic level as those found on
the criterion examination (for reviews see Anderson & Biddle, 1975; Andre, 1979, 1987;
Hamilton, 1985; Melton, 1978; Rickards, 1979; Rickards & Denner, 1979; and a
meta-analytic review by Hamaker, 1986).

Traditional embedded question research has sought to understand the theoretical
processes by which these questions should work. Among the variables of interest have
been question types and information processing changes. By manipulating question
formats (verbatim, semantic), researchers demonstrated that they could influence the type
of cognitive processing engaged in by readers (Rickards, 1979). Although embedded
questions have been used to study strategic behavior during reading in a number of
different settings (Davey, 1987; Garner & Reis, 1981; Spiers & Gallini, 1987; Tobias,
1987), their impact on cognitive monitoring and perceptions of cognitive readiness has not
been adequately explored.

Walczyk and Hall (1989), using college subjects in an introductory psychology
class, attempted to improve calibration of comprehension by adding embedded questions to
an expository text. Their rationale was that embedded questions would encourage
self-testing, which would in turn improve cognitive monitoring. Walczyk and Hall were
unable to find a main effect for embedded questions; however, there were only 20 subjects
in each experimental cell, and the experimental task consisted of fewer than 5 pages of text
material. These conditions make it questionable whether this study was powerful enough to
constitute an adequate test of the facilitating effects of embedded questions.

Pressley, Snyder, Levin, Murray, and Ghatala (1987) also investigated the impact
of embedded questions on perceptions of cognitive readiness. They asked university
students about their perceptions of test readiness using an instrument called PREP
(Perceived Readiness for Examination Performance) that required readers to make
decisions regarding their need to restudy. The logic was that if, after reading an expository
text, readers decided that their understanding of the material was less than optimal, the most
elementary strategic decision they could make was to reread previous material (Garner,
1987). PREP quantified this strategic decision in the following manner.

There were four probes, asking subjects to assess the need to reread in
order to get 20%, 40%, 60%, and 80% correct . . . . Each probe was stated
as follows: Suppose that "passing" on the test were ____%. Do you think
that you would need to reread the text (that is, read it a second time) in order
to get ____%? (Subjects were presented a five-point scale and were required
to pick one value on that scale: 1 = not sure at all . . . 5 = absolutely
certain, no doubt about the choice). (p. 224)

This approach is unique in many respects. Although other researchers asked
readers to evaluate their memory for particular pieces of text material (Glenberg &
Epstein, 1985; Glenberg et al., 1982; Maki & Berry, 1984), Pressley et al. (1987)
asked them to make global evaluations of their readiness. In this way, readers may
perceive that they know little about material and need to restudy without being able to
declare just exactly what it is they do not know. Such an approach captures the executive
processes (resource orchestration) without depending on readers' knowledge about
knowledge. Subjective assessments measured in the PREP instrument were compared
with actual performance to produce a single index using the calculation procedure described
earlier.

Pressley et al. (1987) also measured reading times and a prediction inaccuracy
score, which was calculated by taking the absolute value of the difference between the test
scores that readers predicted and actually obtained on the criterion examination. In the first
experiment, subjects were familiarized with the reading material and test by scanning a
chapter from a developmental psychology text and previewing three sample multiple-choice
questions. The experimental task consisted of 12 pages from another chapter of the same
text. Subjects were allowed to read the material at their own pace but could read it only
once (they were not allowed to look back). They were also informed that they would be
tested with factual questions. The experiment consisted of three conditions: One group of
subjects responded to the PREP instrument before reading, another group responded to the
PREP instrument after reading but before the test, and a third group responded to the PREP
probes after they had taken the test. To establish base data for future research, the
experiment was conducted to determine how much PREP would increase after a single
reading of a passage. The criterion examination consisted of 50 multiple-choice questions.
The only statistically significant difference between groups on the prediction inaccuracy
variable and the PREP instrument was between the after-testing and before-reading groups.
The after-reading group was not significantly more accurate in its subjective assessments
than the before-reading group nor significantly less accurate than the after-testing group.

Pressley et al. (1987) were also concerned that superior PREP scores in the
after-testing group may have been due to increased familiarity with characteristics of the
text or test. To eliminate this possibility, they conducted a second experiment in which
subjects read a chapter and took an examination from the same text they had been reading as
part of their course work; hence, all subjects were familiar with the experimental task and
test. In addition, the researchers measured two individual difference constructs, a general
reading comprehension measure and midterm examination performance. Subjects made the same
performance predictions but were also asked to give a subjective estimate of the very
highest and lowest score they might obtain, a prediction range. The pattern of results in
this experiment was nearly identical to that of the first experiment: prediction
inaccuracy and PREP improvements were detected only with a combination of reading and
test-taking experience. There were slight reading time differences between groups, but an
analysis of covariance was conducted and the relationship between contrasts did not
change. The mean prediction range of the after-testing group was statistically narrower
than that of either before-reading or after-reading subjects. Neither the reading
comprehension measure nor the midterm score interacted with treatment conditions, nor did
the relationship between treatment conditions change when student abilities were
controlled.

A third experiment attempted to increase the accuracy of subjective assessments
during reading by embedding questions within the text itself. The logic of this decision
was explained as follows.

Even though actual learning benefits associated with inserted questions are
often modest and circumscribed (e.g., Anderson & Biddle, 1975; Hamaker,
1986), having to respond to such questions might make obvious whether
what is being read is being remembered, a prediction that is consistent with
contemporary theoretical models of text-questioning. (p. 229)

In this experiment there were three questioning conditions (no questions, a massed
condition, and an interspersed condition), and they were completely crossed with the
before-reading, after-reading, and after-testing conditions. In the massed condition, 15
factual questions were placed at the end of the reading material. In the interspersed
condition, 5 sets of 3 questions were distributed at natural breaks in the text. Reading
times, PREP, prediction inaccuracy, and prediction range measures were taken.

Reading times in the interspersed-questions/after-testing condition were significantly
longer than in the no-question/after-testing condition. No other reading time comparisons
were significant, and groups did not differ on criterion test performance. In both
embedded question conditions, prediction inaccuracy in the after-reading estimates was
significantly better than in the before-reading estimates. After-reading predictions did not
differ from after-testing estimates in the interspersed questioning condition. It is likely that
answering questions on the test provided feedback to readers about their state of
cognitive readiness. Within the after-reading conditions, the interspersed group was
significantly more accurate than the no-question group and slightly better than the
massed-question condition. In the interspersed conditions, the prediction range for
after-reading and after-testing groups was significantly narrower than for the
before-reading group. In the previous two experiments, the only significant PREP difference
was between the after-testing and before-reading groups. In this third experiment, PREP was
significantly more accurate after reading than before reading and as accurate after reading
as after testing.

The findings in Pressley et al. (1987) make a significant contribution to the study of
metacognition and support the view that embedded questions can improve perceptions of
cognitive readiness. By focusing on metacognitive decision making (restudy decisions via
PREP, prediction accuracy, and range), the researchers were able to capture readers'
metacognitive knowledge (self, task, strategies) and executive processes. The PREP
instrument is particularly well suited as an outcome measure of executive processes: the
PREP index forms the basis for a calibration score that is based directly upon the perceived
need for strategic remedial behavior. This assessment is based on readers' global
evaluation of readiness rather than on techniques that ask them to make assessments about
particular elements of content.

This study was designed as an extension to, and adaptation of, the third experiment
of Pressley et al. (1987). Because they successfully demonstrated that embedded questions
could improve readers' perceptions of cognitive readiness, it would be valuable to increase
the ecological validity of the experimental conditions. The present study added conditions
that will make it easier to generalize findings to more practical and relevant college reading
situations and tasks.

Subjects in the Pressley et al. experiments could read the text only once and were
directly prohibited from looking back to previous material. The researchers described this
decision.

This  study  tapped  the  effects  of  questioning  on  PREP  rather  than  question- 
induced  restudy  of  materials. . .  In  doing  so,  the  procedures  used  here 
mirror  a  typical  adjunct-questioning  situation:  Looking  back  is  often 
prevented  when  readers  process  adjunct  questions,  (p.  230) 

In the present study, the effects of embedded questions on PREP were once again
evaluated with a no-lookback condition. However, in order to test the combined effect of
embedded questions and the results of being able to reaccess (restudy) previous material, a
lookback condition was added. Although the no-lookback condition has the advantage of
experimental control (measuring one influence only), there is serious concern about the
ecological validity of such an approach (Duchastel, 1983). Garner (1987) described
lookbacks as a strategic reader option.

The text reinspection strategy is another obvious, adaptive strategy with
particularly high benefits in school situations. The strategy involves (a)
noting memory limitations (i.e., that information once read is not now
remembered); and (b) intentional reaccessing of portions of text that provide
the information. The strategy capitalizes on the permanence of print, an
advantage written language offers over spoken language to learners who are
aware of the distinction. It may well be that this strategy, and others like it
that help overcome the capacity limitations of working memory, and are
among the most important routines learners acquire to assist them in
studying practical problems in natural settings. (p. 52)

The present study is interested in whether embedded questions change perceptions
of cognitive readiness, but perhaps a more pragmatic concern is what effects these
questions have when readers are allowed to engage in more normal reading behavior.

Another difference between the work of Pressley et al. and the present study
concerns the cognitive level of the embedded questions and criterion examination. Pressley
et al. focused on factual information because it "seemed more straightforward for an initial
investigation of PREP than did studying PREP for implicit content" (p. 222). The
embedded questions and criterion test both followed a fill-in-the-blank format. It could be
argued, however, that the effects of both higher-order embedded questions and criterion
test items would be more interesting and more appropriate for college-aged subjects.
Higher-order questions "ask the student to mentally manipulate bits of information
previously learned to create an answer, or to support an answer with logically reasoned
evidence" (Hamaker, 1986, p. 213).

A salient difference between Pressley et al. and the present study involves the type
of orientation instructions subjects received regarding embedded questions. Pressley et al.
informed subjects that they would be encountering questions in the text and that they
should try their best to answer them "in their heads" (p. 230). Although this is a tactical
improvement over more traditional questioning research that often requires readers to
produce a written response for each question (Anderson & Biddle, 1975; Hamaker, 1986),
there remains some question as to the extent that findings from such experimental
conditions can be generalized to normal reading settings; reading for remembering is
predominantly a learner-controlled process that is generally not under the observation,
measurement, or control of experimenter or teacher (Thomas & Rohwer, 1986). Cueing
readers to anticipate questions, together with explicit instructions to answer them, may
produce larger effects but may interfere with normal reading patterns and, conceivably,
create an unwanted form of "conscious cognition" (Flavell, 1981). That is to say, the
effects of embedded questions on perceptions of cognitive readiness will be confounded
with the effects of embedded questions and the instructions that accompany them. In the
present study, subjects were not cued to the presence of embedded questions or asked to
answer them. This experimental condition will test the effects of embedded questions
under more ecologically valid settings because ultimately the effects of embedded questions
on cognitive monitoring depend on what the reader does with them. Embedded questions
have no effect on readers who do not use them, and readers cannot be effectively coerced to
read them, let alone to answer them or to encode material in a highly semantic
fashion. Garner and Alexander (1982) argued a similar point.

With summarization and rereading strategies--or any other cognitive
strategies--an important aspect of the investigation is where on the
continuum from wholly learner-produced (Weinstein, Underwood, Wicker,
& Cubberly, 1979) to wholly investigator-induced they sit. Spontaneous
demonstration of a strategy, it can be argued, implies potential use by the
learner in the nonexperimental setting. (p. 144)

Finally, to determine if the effects of embedded questions are vulnerable to
individual differences, verbal-educational (V-3) and visualization (Vz-2) measures were
also included. The V-3 measure, as a proxy of crystallized general abilities (Cattell, 1963),
predicts performance in a wide variety of situations (Hunt, 1980). In addition to being
verbal, the experimental task in this study required readers to manipulate spatial
information. The Vz-2 measure, as a proxy of fluid intelligence, was used to assess this
spatial ability (Carroll, 1974; Shepard & Feng, 1972). Together, these instruments will
help determine if the effects of embedded questions on perceptions of cognitive readiness
can be generalized across a wide aptitude range. Ultimately, what is needed is to find out
why some adults are more able regulators of their cognition than others, for as Schommer
(1990) reminded us, "The concept of metacognition does not explain why some students
fail to monitor their comprehension" (p. 503).

Summary
Undetected cognitive failure is a problem well documented with young and less
capable readers. Recently, researchers have noted that even adult skilled readers may not
monitor their cognition in an optimal fashion, but externalizing the complex mental events
associated with cognitive monitoring and perceptions of cognitive readiness has been
difficult. Many researchers have attempted to understand these mechanisms by trying to
make a connection between what individuals can declare about the workings of their
memory and subsequent memory performance. Results have been disappointing. There is
serious doubt that readers can give accurate accounts of their memory processes because
many of the cognitive processes associated with memory monitoring, especially those
associated with perceptions of cognitive readiness, are executive processes. They are
mental events not always available to conscious awareness and, therefore, not easily
reportable. This study proposes to measure the impact of embedded questions on
perceptions of cognitive readiness by asking readers to make comparative judgments
regarding their state of cognitive readiness, a tactic that captures the cognitive events
associated with the readers' executive processes.

Utilizing an instrument designed to measure readers' perceived need for remedial
behavior, Pressley et al. (1987) found that embedded questions improved the
correspondence between readers' subjective assessments of readiness and objective
measures of readiness, or calibration. The present study sought to determine if these
findings would hold under conditions more similar to academic reading situations
experienced by adult learners and added conditions that will make it easier to generalize
findings to more practical and relevant college reading situations and tasks.

Hypotheses

Ten hypotheses were tested in the present study. The variables used in these
hypotheses are operationalized and described in the following section.
Prediction  Measures  of  Calibration 

The central measure in this study was the perceived readiness for examination
performance (PREP), as conceived by Pressley et al. (1987), which was operationalized by
asking readers to provide judgments of their perceived need to reread (would need to
reread/would not need to reread) if the criteria for passing the examination were 20, 40, 60,
or 80%. Readers also indicated how confident they were in their would/would not
decisions on a 5-point Likert scale from 1 (low confidence) to 5 (high confidence) for each
of the 4 probes. For example, consider a hypothetical reader who scored 70% on the
criterion examination and responded to the 4 probes as follows:

Probe    Would/would not    Confidence    How Scored
         need to reread     Score

20%      Would not          5             +5
40%      Would not          2             +2
60%      Would              2             -2
80%      Would              4             +4

The PREP score for this reader would be the sum of (+5, +2, -2, +4), or 9. Using this
numerical procedure, PREP scores can range between -20 and +20. Correspondence
between these judgments and subsequent test performance forms the basis for a calibration
score that is based directly on the perceived need for strategic remedial behavior. Main
effects were hypothesized for both embedded questions and the lookback factor on the
PREP score:

1. The mean PREP score for the embedded question groups will be
significantly larger than the mean PREP score for the no-embedded question groups at
p < .05.

2. The mean PREP score for the lookback groups will be significantly larger
than the mean PREP score for the no-lookback groups at p < .05.
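
The PREP scoring rule illustrated in the worked example above can be sketched in a few lines of Python. This is only an illustration, not the scoring program used in the study: the names `ProbeResponse` and `prep_score` are hypothetical, and the sketch assumes that rereading counts as "needed" whenever the reader's actual score falls below the passing criterion named in the probe (the text does not say how a score exactly equal to the criterion is treated). A judgment that agrees with actual performance adds the confidence rating; a judgment that disagrees subtracts it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeResponse:
    """One PREP probe response (hypothetical container for illustration)."""
    passing_pct: int     # passing criterion named in the probe: 20, 40, 60, or 80
    would_reread: bool   # True = "would need to reread"
    confidence: int      # 1 (low confidence) to 5 (high confidence)


def prep_score(actual_pct: float, responses: list[ProbeResponse]) -> int:
    """Sum signed confidence ratings over the probes.

    Assumption: rereading is treated as needed when the actual score is
    below the probe's criterion; ties count as not needing to reread.
    """
    total = 0
    for r in responses:
        reread_needed = actual_pct < r.passing_pct
        judgment_correct = (r.would_reread == reread_needed)
        total += r.confidence if judgment_correct else -r.confidence
    return total


# The hypothetical reader from the table above, who scored 70%.
example = [
    ProbeResponse(20, False, 5),
    ProbeResponse(40, False, 2),
    ProbeResponse(60, True, 2),
    ProbeResponse(80, True, 4),
]
print(prep_score(70, example))  # 9, matching the worked example
```

With four probes and a maximum confidence of 5, the function returns values between -20 and +20, in line with the range stated above.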

Immediately after leaving the reading portion of the experiment, participants
encountered the following probe:

In the next minute or so you will be taking a test over the material you have
just studied. In this test you will be asked to make predictions about the
movement of Xenograde systems based upon the principles you have
studied. What is your best estimate of how well you will do (expressed as
percentage correct) on this examination?

The prediction inaccuracy (PI) variable was operationalized by taking the absolute
difference between the readers' subjective guess and their objective performance on the
criterion examination. Main effects were hypothesized for both embedded questions and
the lookback factor on the PI score:

3. The mean PI score for the embedded question groups will be significantly
larger than the mean PI score for the no-embedded question groups at p < .05.

4. The mean PI score for the lookback groups will be significantly larger than
the mean PI score for the no-lookback groups at p < .05.
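
As a brief illustration of the PI definition given above, the absolute difference can be computed as follows. The function name and the example values are hypothetical and are not data from the study.

```python
def prediction_inaccuracy(predicted_pct: float, actual_pct: float) -> float:
    """Absolute difference between predicted and obtained percentage correct."""
    return abs(predicted_pct - actual_pct)


# A reader who predicts 85% and obtains 70% is off by 15 points;
# so is a reader who predicts 55% and obtains 70%.
over = prediction_inaccuracy(85, 70)
under = prediction_inaccuracy(55, 70)
print(over, under)
```

Because the absolute value is taken, overestimates and underestimates of the same size yield the same PI, and a PI of 0 indicates a perfectly accurate prediction.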

Postdiction  Measure  of  Calibration 

For each item on the criterion test, subjects were asked to judge if they believed
they got the item correct or incorrect (yes/no). They were then asked to rate the confidence
they had in their (yes/no) choice using a 6-point Likert scale. Requiring readers to make a
bifurcated judgment about whether they answered the item correctly is a change from more
traditional postdiction measures where respondents only indicate their confidence that an
item is correct. This revision allows for a more direct assessment of readers who both
know when they know and, equally important, know when they do not know. To
accommodate this change, the numerator on the CAQ measure was modified and the
following adaptation was used to test for group mean differences on postdiction calibration.
To distinguish this measure from others, the acronym was changed to PJC
(postdiction judgment of calibration).

\[
\mathrm{PJC} = \frac{\text{Average confidence for items judged correctly} \;-\; \text{Average confidence for items judged incorrectly}}{\text{Square of variance for all items}}
\]

Main  effects  were  hypothesized  for  both  embedded  questions  and  the  lookback  factor  on 
the  PJC  score: 

5. The mean PJC score for the embedded question groups will be significantly
larger than the mean PJC score for the no-embedded question groups at p < .05.

6. The mean PJC score for the lookback groups will be significantly larger
than the mean PJC score for the no-lookback groups at p < .05.
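
The PJC computation defined above might be sketched as follows. This sketch rests on two assumptions that the text does not settle. First, "items judged correctly" is read as items on which the reader's yes/no judgment matched the actual outcome. Second, because the wording of the denominator is uncertain in this copy, the population standard deviation of all confidence ratings is used as a stand-in, as is common in measures of the CAQ type. The function name `pjc` and the data are hypothetical.

```python
from statistics import mean, pstdev


def pjc(judged_correct: list[bool],
        actually_correct: list[bool],
        confidence: list[int]) -> float:
    """Postdiction judgment of calibration for one reader (illustrative sketch).

    Assumption: the denominator is the population standard deviation of
    the confidence ratings over all items; the source wording is unclear.
    """
    items = list(zip(judged_correct, actually_correct, confidence))
    right = [c for j, a, c in items if j == a]   # judgment matched outcome
    wrong = [c for j, a, c in items if j != a]   # judgment did not match
    if not right or not wrong:
        raise ValueError("need at least one accurate and one inaccurate judgment")
    spread = pstdev(confidence)
    if spread == 0:
        raise ValueError("confidence ratings do not vary")
    return (mean(right) - mean(wrong)) / spread


# Hypothetical reader: four items, confidence on the 6-point scale.
judged = [True, True, False, False]
actual = [True, False, False, True]
ratings = [6, 2, 5, 1]
score = pjc(judged, actual, ratings)
print(round(score, 2))
```

Under these assumptions a positive value indicates that the reader was more confident in accurate judgments than in inaccurate ones, which is the pattern expected of a well-calibrated reader.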

Individual  Difference  Measures 

Two intellectual ability measures were taken prior to the beginning of the
experiment: a measure of verbal-educational ability and a visualization measure. These
measures were taken from the Kit of Factor-Referenced Cognitive Tests (Ekstrom et al.,
1976) and are identified as V-3 (extended verbal ability) and Vz-2 (paper folding test),
respectively. In earlier pilot testing both measures were found to correlate with
performance on the Xenograde test, and the pattern of the correlations depended on group
membership. In this study it was anticipated that embedded questions would best serve
those readers low in verbal and visualization ability. This form of aptitude-treatment
matching has been referred to as the "compensatory model" (Salomon, 1972) and explained
by Cronbach and Snow (1977) as a treatment that can do for some learners what they
cannot do for themselves. In the literature discussed in the previous chapter, researchers
found that many poor readers had very high confidence in their wrong answers
(Shaughnessy, 1981). In the present study, questions were embedded in the text to
encourage readers to question and test their state of cognitive readiness where they may not
have done so on their own. The following interaction hypotheses were tested for both the
embedded question and lookback factor:

7. There will be a significant interaction between the embedded and no-embedded
question groups on each of the three dependent measures (examination performance, PJC,
and PREP) at p < .05, with verbal-educational ability.

8. There will be a significant interaction between the embedded and no-embedded
question groups on each of the four dependent measures (examination performance, PJC,
PREP, and PI) at p < .05, with visualization.

9. There will be a significant interaction between the lookback and no-lookback
groups on each of the four dependent measures (examination performance, PJC, PREP,
and PI) at p < .05, with verbal-educational ability.

10. There will be a significant interaction between the lookback and no-lookback
groups on each of the four dependent measures (examination performance, PJC, PREP,
and PI) at p < .05, with visualization.

The variables PREP and PJC have been used and defined by other researchers as
pre- and post-diction calibration measures of test readiness. However, in the probes
associated with these two measures, readers are asked to make both categorical judgments
(yes/no rereading is necessary in the PREP measure, and yes/no the answer chosen was
correct in the PJC measure) and confidence decisions (their confidence that their categorical
decision was correct). When, as has been done in the present study, calibration is defined
as the correlation between subjective judgments of performance and actual performance,
calibration is most appropriately a measure of the correspondence between the
categorical decisions and objective performance; confidence judgments are an auxiliary
consideration.

As part of the supplementary analysis, a dependent measure of calibration was
created using the categorical portion of the variable only, and results from this variable
were compared with results from the four hypotheses using the PREP and PJC measures.
The influence of these two sources of variance (calibration and confidence) could then be
evaluated independently. As a predictor of calibration, it was anticipated that categorical
decisions alone would be as accurate as categorical decisions and confidence estimates
combined.


CHAPTER  3 
METHODS  AND  PROCEDURES 


This  experiment  was  prepared  as  a  two-factor  design,  embedded  questions  (yes/no) 
and  text  reinspection  (yes/no).  The  purpose  of  the  lookback  factor  was  to  separate  the 
effects  of  embedded  questions  on  perceptions  of  cognitive  readiness  when  combined  with 
re-study  decisions  from  the  effects  of  embedded  questions  when  re-study  was  prohibited. 
The factors were crossed, producing four experimental conditions: embedded questions
with lookbacks allowed (EQLB), embedded questions with no-lookbacks (EQNLB),
no-embedded questions with lookbacks allowed (NEQLB), and no-embedded questions with
no-lookbacks (NEQNLB). Each factor is a between-subjects factor because different
subjects appear in each cell of the design and no matching was undertaken. The
experimental task for all groups was to read and study a text called Xenograde Science
(Merrill, 1965), which was administered through an Apple Macintosh® computer using the
HyperTalk™ programming language.

Sample  Selection  and  Participants 
Participants  in  the  study  were  recruited  from  nine  different  courses  being  taught  at 
the  University  of  Florida,  representing  a  potential  subject  pool  of  approximately  300  adult 
skilled readers. Each of these classes was among the junior- and senior-level courses
being offered by the College of Education. Participation in the study was voluntary and
data collection was terminated when all students from the nine courses who wished to be
part of the study had the opportunity to do so. The final sample was 168 students.
Instructors offered students an incentive to participate in the study in the form of extra
points toward course grade. Bias stemming from recruiting subjects from different classes
under different incentive conditions was minimized through random assignment to
experimental condition.

Pilot  Studies 

Two pilot studies were conducted. One hundred sixty-four (164) student
volunteers from the accessible population described above constituted a sample for the first
pilot and were randomly assigned to one of the four treatment conditions. In the second
pilot, 62 students similarly selected were randomly assigned to either an EQLB or NEQLB
group. Findings from these two pilots were used to guide the present study.

Unlike in the present study, participants in the first pilot were allowed to take as
much  time  as  necessary  to  learn  the  experimental  task.  The  computer  kept  a  record  of  the 
number  of  seconds  they  spent  on  each  frame  of  text  as  well  as  text  reinspection  behavior  in 
the  two  lookback  conditions.  Mean  reading  times  were  longer  for  groups  encountering 
embedded  questions  and  for  groups  allowed  to  look  back  to  previous  material.  For  all 
groups,  the  mean  reading  time  was  slightly  less  than  20  minutes.  Other  than  the  aggregate 
group differences reported here (differences in reading times), the variable time was
completely uninformative with respect to other dependent measures. In the second pilot, a
paper-and-pencil  version  of  the  Xenograde  material  was  administered  to  two  lookback 
conditions,  an  embedded  question  group  (EQ)  and  a  no-embedded  question  group  (NEQ). 
The  study  time  in  this  pilot  was  limited  to  22  minutes. 

In both pilots, readers were asked to give two types of comparative judgements,
one about themselves relative to their peers and another about the difficulty of the task
compared with other materials they were learning. These questions were administered after
studying the Xenograde text and before taking the test. In both pilots, the relationship
between evaluations of self and number of items correct, and the relationship between
evaluations of the task and number of items correct, were stronger in the EQ groups than
in the comparison conditions. These data were taken as preliminary evidence that readers
who encountered embedded questions developed better prediction calibration; that is, they
had more accurate perceptions regarding their own ability and task difficulty when
compared with a no-question control group. The same embedded questions that were used
in these pilots were used in the present study.

Participants  in  the  first  pilot  were  also  asked  to  indicate  the  confidence  they  had  in 
the  answers  they  picked  on  an  18-item  criterion  test.  Differences  in  prediction  accuracy 
were measured using a technique developed by Zimmerman et al. (1977). This measure is
called  the  confidence-judgement  accuracy  quotient  (CAQ)  and  is  analogous  to  a  signal 
detection  analysis.  It  is  calculated  as  follows: 

$$\mathrm{CAQ} = \frac{\text{Average confidence for items correct} - \text{Average confidence for items judged wrong}}{\text{Square of variance for all items}}$$

This statistic was used to calculate a postdiction calibration score for each subject. When
data were collapsed across the lookback condition, readers who had encountered the
embedded questions were better calibrated at the time of testing, F(1, 160) = 4.2, p < .05.
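
The CAQ described above can be illustrated with a minimal sketch for a single subject. Several details here are assumptions rather than the procedure of Zimmerman et al.: the denominator is taken to be the standard deviation (square root of the variance) of the confidence ratings over all items, whereas the text above reads "square of variance"; "items judged wrong" is treated as items answered incorrectly; and the ratings and function name are hypothetical.

```python
from statistics import mean, pstdev


def caq(confidence, correct):
    """Confidence-judgment accuracy quotient for one subject (illustrative).

    confidence: one confidence rating per test item
    correct:    one boolean per item, True if the item was answered correctly
    ASSUMPTION: the denominator is the population standard deviation of all
    confidence ratings; the source text does not settle this detail.
    """
    right = [c for c, ok in zip(confidence, correct) if ok]
    wrong = [c for c, ok in zip(confidence, correct) if not ok]
    spread = pstdev(confidence)
    if not right or not wrong or spread == 0:
        return None  # quotient is undefined for this response pattern
    return (mean(right) - mean(wrong)) / spread


# Hypothetical ratings on a 1-5 scale for a six-item test.
confidence = [5, 4, 5, 2, 1, 3]
correct = [True, True, True, False, False, True]
score = caq(confidence, correct)
print(score)
```

A positive quotient indicates that the subject was more confident on correct items than on incorrect ones, which is the sense in which a higher CAQ reflects better postdiction calibration.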

An item analysis was performed on the criterion test of the first pilot study. Using
the Kuder-Richardson 20 procedure, the reliability of this test was calculated as r = .69.
From these data a new test was created for the purpose of increasing reliability.
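
For readers unfamiliar with the Kuder-Richardson 20 coefficient, the following is a small sketch of the standard formula, KR-20 = (k / (k - 1)) * (1 - sum(p_i * q_i) / variance of total scores), applied to dichotomously scored items. The response matrix is hypothetical, the function name is illustrative, and the use of the population variance for total scores is an assumption; none of this reproduces the pilot data.

```python
def kr20(responses):
    """Kuder-Richardson 20 reliability for 0/1 item scores.

    responses: list of subjects, each a list of 0/1 scores (one per item).
    ASSUMPTION: population variance (divide by n) is used for total scores,
    matching the p * q form of the item variances.
    """
    n = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    if var_total == 0:
        raise ValueError("total scores have zero variance")
    sum_pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in responses) / n  # proportion correct on item i
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical responses: 5 subjects by 4 items.
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
reliability = kr20(data)
print(round(reliability, 2))
```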


Instrumentation 
Materials 

Xenograde  Science  (Merrill,  1965)  is  a  hierarchical  (concepts  build  upon  each 
other)  learning  set  that  describes  the  physics  of  a  make-believe  solar  system.  A  contrived 
task was used for two reasons: (a) to minimize the effects of domain-specific prior
knowledge and (b) because novel material has been reported to produce more conscious,
analytical processing than material more familiar to subjects (Hare & Smith, 1982). Earlier
pilot testing revealed that it is of sufficient difficulty for the target population of this study.

The  Xenograde  task  consisted  of  20  frames  of  text  totalling  approximately  1700 
words.  There  were  19  diagrams  and  2  tables.  In  the  two  EQ  conditions,  after  every  fifth 
frame  of  text  a  frame  containing  the  embedded  questions  was  included.  There  were  18 
distinct  questions  in  these  five  frames,  adding  approximately  280  words  to  the  task.  The 
first embedded question asked about the nomenclature of the Xenograde solar system; other
questions asked readers to make predictions based on principles they should have learned
in their study of Xenograde Science. The text was presented one page at a time on an
Apple Macintosh® computer using the HyperTalk™ programming language. Readers
controlled  page  turning  by  pressing  designated  keys  on  the  keyboard.  A  paper  version  of 
materials  representing  an  EQ  condition  is  included  in  Appendix  B. 
Test 

A  criterion  test  consisting  of  19  multiple-choice  questions  was  administered.  The 
first  question  dealt  with  recognition  of  Xenograde  nomenclature;  all  other  questions 
required subjects to apply the principles they had learned while reading the Xenograde
Science  material.  A  paper  version  of  this  test  is  included  in  Appendix  D. 
Design 

A  2  X  2  factorial  design  was  used:  Two  factors,  each  with  two  levels  (embedded 
questions  vs.  no  embedded  questions  and  lookbacks  allowed  vs.  lookbacks  not  allowed), 
were  crossed.  The  two  manipulated  factors  of  this  study  and  the  corresponding 
experimental  groups  are  illustrated  below. 

                              Lookbacks Allowed
                              Yes          No

Embedded           Yes        EQLB         EQNLB
Questions
Present            No         NEQLB        NEQNLB

Procedures

Before each experiment began, the visualization (Vz-2) and extended verbal (V3)
measures were administered. Once the experiment began, subjects were read a common
script introducing them to the task and experimental objectives (see Appendix A). Subjects
picked a unique identification number from a box of numbers that were generated for that
purpose. Their attention was then directed to the first frame of the computerized textbook
and they were instructed to proceed (see Appendix B). These instructions repeated some of
the experimental objectives and informed participants of their rights as human subjects. On
the seventh page of these instructions, subjects were required to enter into the computer
their unique identification number and were thus routed to the appropriate experimental
condition.

After 23 minutes, or when participants had determined they had studied the material
adequately (whichever occurred first), the computer routed them to the
questionnaire portion of the study that measured PI and PREP (see Appendix C). Subjects
responded to the concurrent questions as they were presented, and their answers were
recorded on a remote text file in the computer. After they finished the questionnaire, the
computer routed them to the Xenograde test. Upon completing the test, all participants
were thanked and dismissed.

The decision to limit the amount of time that subjects could read and study the
Xenograde text was partially motivated by the need for experimental control of potentially
mediating variables. Moreover, traditional educational settings often must operate under
the same constraint: time allocated to learn a given quantity of material is generally finite
(Cronbach & Snow, 1977).

Experimental Setting
The  experiment  was  conducted  in  a  quiet,  well-lit  office  in  the  College  of  Education 
at the University of Florida. This experimenter, who was blind to group membership,
supervised every session. A schedule had been previously prepared that included each of
the 50-minute day-time class periods as one experimental time block. Subjects were
asked to sign their names in the block most convenient to them and to attend that
session.  To  minimize  internal  validity  threats  (Campbell  &  Stanley,  1966)  subjects  were 
explicitly  asked  not  to  discuss  the  experiment. 

Data  Analysis 

Of  the  three  dependent  variables  (PJC,  PI,  and  PREP)  PI  is  a  ratio  level  number, 
whereas  PJC  and  PREP  are  both  interval  measures.  Group  membership  (EQLB,  NEQLB, 
EQNLB,  NEQNLB)  is,  of  course,  categorical  in  nature  and  represents  the  manipulated 
independent  variables  in  the  study.  The  two  intellectual  measures,  V3  and  Vz-2,  are  both 
interval level numbers and were used as covariates to test for aptitude-treatment
interactions.

The ten research hypotheses were grouped into four conceptually distinct
experimental units corresponding to the four dependent variables under investigation
(PREP, PI, PJC, and the two intellectual abilities). Each test of main effects was
preceded by an examination of the interaction hypothesis. If there was an interaction
between the two factors (embedded questions and lookback), then pairwise comparisons of
cell means were conducted. If no interaction was found, then a two-way analysis of
variance (ANOVA) was conducted to test for the presence of main effects. The
Brown-Forsythe ANOVA procedure was to be used if violations of the homoscedasticity
assumption appeared to be a problem.

In the last conceptual unit an investigation was undertaken to determine if there
were any interactions between the two levels of two factors (EQ vs. NEQ and LB vs.
NLB) when either of two intellectual measures (Vz and V3) was controlled in
predicting individual differences in the three key dependent measures (examination
performance, PREP, and PJC). This conceptual unit is composed of 12 interaction tests
using the general form of the regression model specified below.

$$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + \varepsilon$$

where $Y$ is the dependent variable, $X_1$ is group membership, $X_2$ is the intellectual
measure, and $\beta_3$ is the interaction parameter.

In each model, the interaction parameter was tested for statistical significance. The
familywise error rate was controlled within this conceptual unit. Because effects were
expected in each of the hypotheses outlined, a directional alternative was used in each test of
significance.
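
The regression model above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: group membership is dummy coded 0/1, the data are hypothetical and noise-free, the function names are invented for this example, and estimation is by ordinary least squares through the normal equations. The significance test of the interaction parameter is not shown.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_interaction(group, ability, y):
    """Least-squares estimates [alpha, beta1, beta2, beta3] for
    y = alpha + beta1*group + beta2*ability + beta3*group*ability."""
    rows = [[1.0, g, x, g * x] for g, x in zip(group, ability)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(4)]
    return solve(xtx, xty)


# Hypothetical data: 0 = NEQ, 1 = EQ; ability is an aptitude score.
group = [0, 0, 0, 0, 1, 1, 1, 1]
ability = [10, 12, 14, 16, 10, 12, 14, 16]
y = [2 + 1 * g + 0.5 * x + 0.25 * g * x for g, x in zip(group, ability)]

alpha, beta1, beta2, beta3 = fit_interaction(group, ability, y)
slope_neq = beta2          # ability slope in the group coded 0
slope_eq = beta2 + beta3   # ability slope in the group coded 1
print(alpha, beta1, beta2, beta3)
```

A nonzero interaction parameter means the slope relating the intellectual measure to the dependent variable differs between the two groups, which is what an aptitude-treatment interaction amounts to in this model.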

Limitations  and  Delimitations 
The principal limitation of this study is that it is based on an experimentally
engineered learning event. The objective, however, is to generalize from findings in this
research to the types of reading and studying behavior more typically engaged in by adult
learners. Generalizing from this study to nonexperimental situations seems warranted for
two  reasons:  First,  readers  in  this  study  were  not  cued  to  process  the  questions  in  any 
fashion,  and  therefore  effects  discovered  were  learner-produced  rather  than  investigator- 
induced;  second,  perceptions  of  cognitive  readiness  are  presumed  to  be  present  in  all 
situations  where  learners  are  reading  to  prepare  for  an  examination.  The  experimental 
situation  also  consisted  of  students  reading  text  from  a  computer  screen  and  recording  their 
responses with a computer keyboard. It should be understood, however, that no computer
literacy was required and typing skills were reduced to depressing two keys on the
keyboard.

Also, the Xenograde experiment consisted of less than 23 minutes of reading
wherein these perceptions could materialize, and it is for this reason that group differences
on  the  criterion  examination  were  expected  to  be  nonsignificant.  However,  if  embedded 
questions  improve  perceptions  of  cognitive  readiness,  it  is  reasonable  to  expect  that  they 
will  improve  test  performance  in  longer,  more  natural  study  conditions,  although  this 
requires  empirical  confirmation. 

In  addition,  it  is  unlikely  that  the  college  subjects  in  this  study  had  any  interest  in 
either  learning  the  experimental  materials  or  doing  well  on  the  examination.  Although  there 
were  cues  in  the  stimulus  materials  that  were  designed  to  generate  ego  involvement,  it  is 
likely  that  the  only  source  of  subject  motivation  was  the  promise  of  extra  credit  points  in 
academic classes for participating in the experiment. If the effects of embedded questions
can  be  discovered  under  these  conditions,  it  may  also  be  reasonable  to  expect  that  similar  or 
more  robust  effects  might  be  found  in  more  typical,  high-stakes  academic  reading 
situations. 

Although it is assumed that overall motivation in the experiment was low, there was
no reliable way of discriminating among gradations of subject compliance with the orienting
instructions to "do the best they could." This fact makes the interpretation of calibration
scores potentially problematic. Specifically, it is likely that some readers exerted very little
or no effort to understand the experimental text and yet obtained higher than average
calibration scores; these subjects never intended to do well on the experiment and as a
result would be relatively positive (and accurately so) that they did not understand.
Whereas this problem is uncontrollable in the present study, it is anticipated that the bias it
introduces was unsystematic and ultimately does not threaten the basis for making
generalizations from findings.


CHAPTER  4 
RESULTS  AND  DISCUSSION 


The purpose of this study was to examine the effects of two factors, embedded
questions and text lookbacks, on readers' calibration of comprehension. Although group
differences in postdiction judgments of calibration (PJC) were included in this
investigation, the principal focus was reader judgments and beliefs before testing
(prediction calibration), for these are the subjective beliefs that guide strategic
decision-making during reading. Two measures of prediction calibration (PREP and PI) were used.
Two  intellectual  ability  measures  (Vz  and  V3)  were  also  included  to  determine  if  the  effects 
under  investigation  were  vulnerable  to  individual  differences.  Results  from  the  ten  research 
hypotheses  will  be  presented  in  the  first  section  of  this  chapter.  A  more  detailed  discussion 
will  follow  in  a  second  section. 

Results 

The  primary  question  of  interest  in  this  experiment  was  whether  embedded 
questions  would  improve  prediction  measures  of  calibration.  The  perceived  need  for 
strategic  remedial  behavior  was  measured  using  the  PREP  instrument,  which,  when 
compared  with  objective  performance,  formulates  the  dependent  variable  in  the  first  two 
research  hypotheses.  Descriptive  statistics  are  presented  in  Table  4-1. 

1. The mean PREP score for the embedded question groups will be
significantly larger than the mean PREP score for the no-embedded question groups at the
.05 level.

2. The mean PREP score for the lookback groups will be significantly larger
than the mean PREP score for the no-lookback groups at the .05 level.

Table 4-1.
Descriptive Statistics for PREP Score by Experimental Condition

Experimental Group    N     M       SD
EQLB                  42    9.48    4.61
EQNLB                 42    9.56    5.50
NEQLB                 42    6.25    5.59
NEQNLB                42    6.91    5.56

Summary ANOVA statistics are presented in Table 4-2. Although there was some
difference between embedded question conditions as a function of the lookback option, this
two-way interaction was not significant, t (164) = .35, p > .05. Because there was no
interaction, tests of main effects are appropriate. The variance of the errors at all values of
the predictor variable for each of the three dependent variables (PREP, PI, PJC) appears
constant. For this reason, analysis of variance (ANOVA) was used instead of a
Brown-Forsythe ANOVA in each test of main effects.
Table 4-2.
Two-way ANOVA Statistics for PREP Scores by Experimental Group

Source    df     SS         MS        F
EQ (A)    1      360.21     360.21    12.65*
LB (B)    1      5.36       5.36      .19
AB        1      3.43       3.43      .12
Error     164    4670.62

* p < .05.

For hypothesis 1, the mean PREP score was greater for the EQ group (9.52) than
for the NEQ group (6.58), a mean difference of 2.94. The obtained t value for this main
effect was t (164) = 3.56, indicating a significant difference between means.

For  hypothesis  2,  the  mean  PREP  score  was  smaller  for  the  LB  group  (7.87)  than 
for  the  NLB  group  (8.24),  a  mean  difference  of  .37.  The  obtained  t  value  for  this  main 
effect  was  t  (164)  =  .44,  indicating  a  non-significant  difference  between  means. 
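
As a check on how the reported quantities relate to one another, the sketch below recomputes the marginal means from the cell means in Table 4-1 and the F ratios from the sums of squares in Table 4-2. It assumes equal cell sizes (42 per cell, as reported) and uses the standard identity that, for an effect with one degree of freedom, t is the square root of F. The variable names are illustrative only.

```python
import math

# Cell means from Table 4-1 (42 subjects per cell).
cell_means = {"EQLB": 9.48, "EQNLB": 9.56, "NEQLB": 6.25, "NEQNLB": 6.91}

# With equal cell sizes, a marginal mean is the average of its two cell means.
eq_mean = (cell_means["EQLB"] + cell_means["EQNLB"]) / 2
neq_mean = (cell_means["NEQLB"] + cell_means["NEQNLB"]) / 2

# Sums of squares from Table 4-2; each effect has 1 degree of freedom,
# so its mean square equals its sum of squares.
ss_effect = {"EQ": 360.21, "LB": 5.36, "EQ x LB": 3.43}
ss_error, df_error = 4670.62, 164

ms_error = ss_error / df_error
f_ratio = {name: ss / ms_error for name, ss in ss_effect.items()}
t_value = {name: math.sqrt(f) for name, f in f_ratio.items()}

for name in ss_effect:
    print(f"{name}: F(1, {df_error}) = {f_ratio[name]:.2f}, t = {t_value[name]:.2f}")
```

Under these assumptions the embedded-question effect works out to F = 12.65 and t = 3.56, matching the values reported for hypothesis 1.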

Two-way  ANOVA  was  used  in  hypotheses  3  and  4  where  the  dependent  variable 
was  PI,  or  prediction  inaccuracy  score.  Descriptive  statistics  are  presented  in  Table  4-3. 

3.  The  mean  PI  score  for  the  embedded  question  groups  will  be  significantly 
larger  than  the  mean  PI  score  for  the  no-embedded  question  groups  at  the  .05  level. 

4.  The  mean  PI  score  for  the  lookback  groups  will  be  significantly  larger  than 
the  mean  PI  score  for  the  no-lookback  groups  at  the  .05  level. 

Table 4-3.
Descriptive Statistics for PI Score by Experimental Condition

Experimental Group    N     M        SD
EQLB                  42    20.88    13.91
EQNLB                 42    21.67    16.37
NEQLB                 42    26.00    18.24
NEQNLB                42    26.17    15.28

The two-way interaction between embedded question conditions as a function of the
lookback option was not significant, t (164) = .14, p > .05. For hypothesis 3, the mean PI
score was smaller, and hence more accurate, for the EQ group (21.28) than for the NEQ
group (26.09), a mean difference of 4.81. The obtained t value for this main effect was
t (164) = 1.94, indicating a significant difference between means. For hypothesis 4, the
mean PI score was smaller for the LB group (23.44) than for the NLB group (23.92), a
mean difference of .48. The obtained t value for this main effect was t (164) = .19,
indicating a non-significant difference between means. Summary ANOVA statistics are
presented in Table 4-4.

Table 4-4.
Two-way ANOVA Statistics for PI Scores by Experimental Group

Source    df     SS          MS        F
EQ (A)    1      971.52      971.52    3.78*
LB (B)    1      9.52        9.52      .04
AB        1      4.02        4.02      .02
Error     164    42121.57

* p < .05.

The  purpose  of  the  next  two  hypotheses  was  to  test  the  impact  of  embedded 
questions  on  postdiction  judgments  of  calibration.  Descriptive  statistics  are  presented  in 
Table 4-5 and summary ANOVA statistics are presented in Table 4-6.

5. The mean PJC score for the embedded question groups will be significantly
larger than the mean PJC score for the no-embedded question groups at the .05 level.

6. The mean PJC score for the lookback groups will be significantly larger
than the mean PJC score for the no-lookback groups at the .05 level.

Although there is some difference between embedded question conditions as a
function of the lookback option, this two-way interaction was not significant, t (164) =
.86, p > .05. For hypothesis 5, the mean PJC score was greater for the EQ group (4.28)
than for the NEQ group (3.37), a mean difference of .91. The obtained t value for this
main effect was t (165) = 1.08, indicating a non-significant difference between means.

Table 4-5.
Descriptive Statistics for PJC Score by Experimental Condition

Experimental Group    N     M       SD
EQLB                  42    5.53    6.45
EQNLB                 42    3.03    5.55
NEQLB                 42    4.00    6.01
NEQNLB                42    2.73    4.40
Table 4-6.
Two-way ANOVA Statistics for PJC Scores by Experimental Group

Source    df     SS         MS        F
EQ (A)    1      37.02      37.02     1.16
LB (B)    1      143.60     143.60    4.48*
AB        1      17.90      17.90     .56
Error     164    5257.87

* p < .05.

For  hypothesis  6,  the  mean  PJC  score  was  greater  for  the  LB  group  (4.77)  than  for 
the  NLB  group  (2.88),  a  mean  difference  of  1.89.  The  obtained  t  value  for  this  main  effect 
was  t  (165)  =  2.12,  indicating  a  significant  difference  between  means. 

Hypotheses 7 through 10 were formulated to determine if there were any aptitude x
treatment interactions between the two levels of two factors (EQ vs. NEQ and LB vs.
NLB), with either of two intellectual measures (Vz and V3), when predicting individual
differences in the three key dependent measures (examination performance, PREP, and
PJC). These hypotheses constitute 12 separate tests of statistical interaction. Normally,
provisions would have had to be made to control the familywise error rate; however, no
interactions were discovered even without making this adjustment. Although missing data
(34 missing V3 and Vz values) attenuated the relationship among the intellectual measures
and dependent variables, it is not likely that a complete data set would have changed the
outcome of the tests. The only interaction that approached significance involved
verbal-educational ability in the prediction of PREP (t (133) = 1.62, p > .05) in the EQ vs. NEQ
comparison. Higher PREP scores were associated with higher verbal scores in the EQ
conditions whereas slightly lower PREP scores were associated with lower verbal scores in
the NEQ conditions. Slopes associated with each of the 12 tests can be inspected in
Appendix E. There were no meaningful patterns discernible in the direction of the slopes
and a visual inspection of the regression plots confirmed that the relationship among the
intellectual measures and the dependent variables was not polynomial. An intercorrelational
matrix of the dependent variables can also be inspected in Appendix F.

Discussion 
In this study, subjective beliefs and judgments regarding cognitive readiness were
more accurate in the groups who encountered embedded questions than in those who did not.
A main effect for embedded questions was found with both the PI and PREP measures of
prediction calibration. There were no statistical differences between the EQ and NEQ
groups on postdiction judgments of calibration (PJC); however, groups who were allowed
to reinspect the text (LB) did have statistically more accurate PJC scores than those groups
who were not allowed to look back. The effects discovered in this study did not vary as a
function of the two individual differences (Vz and V3) that were measured. Discussion
will now focus on a more detailed analysis of the 10 research hypotheses.
Criterion Test

Because  of  the  brief  duration  of  the  experiment,  group  differences  on  the  criterion 
examination  were  not  anticipated.  Although  test  performance  was  superior  in  the 
embedded  questions  groups,  differences  were  not  significant.  A  two-way  analysis  of 
variance  (ANOVA)  was  used  to  test  for  the  presence  of  interactions  or  main  effects. 
Descriptive  statistics  are  presented  in  Table  4-7  and  summary  ANOVA  statistics  are 
presented  in  Table  4-8.  No  interaction  or  main  effects  were  present. 
Table 4-7.
Descriptive Statistics of Percent Correct by Experimental Condition

Experimental Group    N     M        SD
EQLB                  42    54.69    19.01
EQNLB                 42    48.44    18.21
NEQLB                 42    47.80    18.20
NEQNLB                42    45.37    15.78

Table 4-8.
Two-way ANOVA Statistics for Percent Correct by Experimental Group

Source    df     SS          MS         F
EQ (A)    1      1040.02     1040.02    3.35
LB (B)    1      788.67      788.67     2.54
AB        1      152.38      152.38     .49
Error     164    50861.76

Kuder-Richardson 20 was used to calculate reliability for this 19-item test (r = .69).
Item difficulties ranged from relatively easy (89% correct) to relatively difficult (15%
correct), and the average item difficulty was 49% correct with a standard deviation of 19.
Predicted Correct

The PI variable, as a measure of prediction calibration, was derived by taking the
absolute value of the difference between the scores readers predicted they would receive on
the criterion test (subjective judgements) and the scores they ultimately obtained (objective
measures). Subjective judgments of the four groups are presented in Table 4-9.
Table 4-9.
Descriptive Statistics of Predicted Correct by Experimental Condition

Experimental Group    N     M        SD
EQLB                  42    72.81    19.41
EQNLB                 42    65.50    21.08
NEQLB                 42    71.14    19.12
NEQNLB                42    70.88    15.83

One feature immediately discernible when comparing Tables 4-7 and 4-9 is the
overconfidence displayed by all groups, who, on average, overestimated how well they
would do on the criterion examination by 21 percentage points, a clear example of poor
calibration. A monotonic pattern is maintained, in part, between the subjective and
objective estimates of performance; test scores were highest in the EQLB group and so
were their subjective estimates of performance. Likewise, test scores were lowest in the
NEQNLB group and so were their subjective estimates. The EQLB and NEQLB scores
were juxtaposed in both measures and changed relative positions in the two tables.
Prediction Calibration

Two  measures  of  prediction  calibration  were  used  in  this  study  (PI  and  PREP). 
The  first  question  subjects  were  asked  after  reading  the  text  was  to  estimate  the  score  they 
would  obtain  on  the  examination  (PI).  Immediately  after  responding  to  this  probe,  subjects 
were  asked  about  their  perceived  need  to  reread  at  the  20,  40,  60,  and  80%  criterion  level 
(PREP). Although encountering embedded questions produced main effects with both
measures, the effects were stronger when PREP was the dependent variable (d = .55)
than they were when PI was used as the dependent measure (d = .30). A standardized
mean difference, d, was used to compare effect sizes because the standard deviations
of the two variables differed greatly and a natural common scale is interpretable for the
PI measure but not the PREP measure (Green & Hall, 1984).
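
One way to arrive at effect sizes of this magnitude is sketched below. The choice of standardizer is an assumption: the text does not say how d was computed, and here the mean difference is divided by the square root of the ANOVA mean square error (the pooled within-cell standard deviation). The means and error terms are taken from the results reported earlier in this chapter, and the function name is illustrative.

```python
import math


def standardized_difference(mean_a, mean_b, ss_error, df_error):
    """Standardized mean difference d.

    ASSUMPTION: the standardizer is the pooled within-cell standard
    deviation, i.e. the square root of the ANOVA mean square error.
    """
    return abs(mean_a - mean_b) / math.sqrt(ss_error / df_error)


# PREP: EQ vs. NEQ marginal means, error term from Table 4-2.
d_prep = standardized_difference(9.52, 6.58, 4670.62, 164)
# PI: EQ vs. NEQ marginal means, error term from Table 4-4.
d_pi = standardized_difference(21.28, 26.09, 42121.57, 164)

print(f"d (PREP) = {d_prep:.2f}, d (PI) = {d_pi:.2f}")
```

Under this assumption the two values round to .55 and .30, in agreement with the effect sizes reported above.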

An inspection of the data associated with these two variables reveals that decisions
associated with the PREP variable were generally more conservative than, and often in
contradiction with, decisions made in conjunction with the PI measure. For example, 49
subjects predicted test scores of 81% or higher, but when asked on the PREP variable if
they thought they would need to reread to obtain a score of 80%, 16 of these 49 subjects
said they would. At the 60% criterion level, 124 subjects predicted test scores of 61% or
higher, but when asked on the PREP variable if they thought they would need to reread to
obtain a score of 60%, 32 of these 124 subjects indicated that they would. In this same
conservative direction, 28 subjects contradicted themselves at the 40% probe and 17 at the
20% probe. Because poor calibration is generally the result of overconfidence, the more
conservative decisions associated with the PREP measure are the likely source of the
larger effects in this variable as compared with the PI measure.
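
The size of the discrepancy between the two probes can be expressed as a proportion. The
sketch below uses only the counts reported above for the 80% and 60% criterion levels; the
40% and 20% levels are omitted because the number of subjects predicting above those
levels is not reported here. The function name is an illustrative assumption.

```python
# Counts reported in the text: criterion level -> (subjects whose PI prediction
# exceeded the criterion, how many of those still said they would need to reread).
reported_counts = {80: (49, 16), 60: (124, 32)}

def contradiction_rate(n_predicted_above, n_would_reread):
    """Proportion of confident predictors who nonetheless chose to reread."""
    return n_would_reread / n_predicted_above

for level, (n_above, n_reread) in reported_counts.items():
    rate = contradiction_rate(n_above, n_reread)
    print(f"{level}% criterion: {n_reread} of {n_above} = {rate:.1%}")
```

Under these counts, roughly one third of the confident predictors at the 80% level, and
roughly one quarter at the 60% level, gave the more conservative PREP response.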

The reason why more conservative decisions were made in connection with the
PREP measure is open to speculation. Although PI and PREP are conceptually affiliated,
psychologically they ask subjects to make somewhat different decisions: in the PI probe
subjects were asked to make a subjective estimate of what they thought they could do on
the test, whereas the PREP probes asked them to estimate whether strategic behavior
(rereading) would be necessary to make sure they could reach four specific criterion levels.
The subjective estimates made in association with the PI measure, although reflecting
here-and-now perceptions of cognitive readiness, were probably also affected by the
reader's past performance: "I am the type of person who generally scores ___% on
examinations." According to MacKenzie (1989), a learner's best estimate of performance,
in the absence of other stronger cues, will be their mean past performance.

The probes associated with PREP and PI evidently tapped different sources of
subjective feelings, and this distinction served to weaken the empirical relationship between
these two variables. The intercorrelations presented in Appendix F show that the
correlation between these two variables was statistically significant and in the predicted
direction; however, the relationship was relatively weak (r = -.39). One possible
explanation is that PREP is a measure of both calibration and confidence, whereas the PI
variable is a measure of calibration only. This explanation is unsupportable, however,
given that the correlation between PI and the calibration portion of PREP alone was
r = -.37.

As mentioned in the previous chapter, an analysis was undertaken of the categorical
decisions associated with the PREP variable alone. In the PREP measure, readers were
asked if they would need to reread (yes/no) if the criteria for passing the examination were
20, 40, 60, and 80%. Under such an arrangement, four correct decisions are possible.
Descriptive statistics for number of correct decisions by group are presented in Table 4-10
and summary ANOVA statistics are presented in Table 4-11.
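
How the four categorical decisions might be scored can be sketched as follows. The scoring
rule is an assumption made for illustration, since the rule is not spelled out in this
section: a "no" (no need to reread) is counted as correct when the reader's test score met
the criterion, and a "yes" is counted as correct when it did not. The function and variable
names are hypothetical.

```python
# Criterion levels used by the PREP probes (percent correct needed to pass).
CRITERIA = (20, 40, 60, 80)

def count_correct_prep_decisions(needs_reread, test_score):
    """Count how many of the four reread decisions match actual performance.

    needs_reread: dict mapping each criterion level to True ("yes, I would
        need to reread") or False ("no").
    test_score: the reader's actual percent-correct score.

    Assumed rule (illustrative only): "no" is correct when the score met the
    criterion; "yes" is correct when the score fell short of it.
    """
    correct = 0
    for level in CRITERIA:
        met_criterion = test_score >= level
        said_no_reread = not needs_reread[level]
        if said_no_reread == met_criterion:
            correct += 1
    return correct

# A hypothetical reader who scored 55% but felt ready for every level up to 60%.
reader = {20: False, 40: False, 60: False, 80: True}
print(count_correct_prep_decisions(reader, 55))
```

In this hypothetical case the reader is overconfident only at the 60% level, so three of
the four decisions are scored as correct.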
Table 4-10.

Descriptive Statistics of Number of Correct PREP Decisions by Experimental Condition

Experimental Group    N     M       SD
EQLB                  42    3.14    .751
EQNLB                 42    3.24    .760
NEQLB                 42    2.74    .800
NEQNLB                42    2.81    .862


Table 4-11.

Two-way ANOVA Statistics for Percent Correct by Experimental Group

Source    df     SS        MS      F
EQ(A)     1      7.29      7.29    11.57*
LB(B)     1      .29       .29     .46
AB        1      .01       .01     .01
Error     164    103.36

*p < .05.
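
The F ratios in a summary table of this kind follow from the sums of squares and degrees
of freedom: each effect's mean square is divided by the error mean square. The sketch below
recomputes them from the values in Table 4-11; the function name is an illustrative
assumption, and small discrepancies from the tabled values are to be expected because the
sums of squares are themselves rounded.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F = (SS_effect / df_effect) / (SS_error / df_error)."""
    ms_effect = ss_effect / df_effect
    ms_error = ss_error / df_error
    return ms_effect / ms_error

# Error term and effect sums of squares as reported in Table 4-11.
SS_ERROR, DF_ERROR = 103.36, 164
effect_ss = {"EQ(A)": 7.29, "LB(B)": 0.29, "AB": 0.01}

for source, ss in effect_ss.items():
    print(f"{source}: F(1, {DF_ERROR}) = {f_ratio(ss, 1, SS_ERROR, DF_ERROR):.2f}")
```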

By comparing Table 4-2, where PREP was analyzed using both the calibration and
confidence components of its measure, with Table 4-10, where the calibration (categorical)
decisions are considered alone, it is possible to discern that the main effects for the EQ
conditions are nearly identical regardless of approach. The standardized main effect for
EQ in the two-component method was d = .55; with calibration decisions considered alone,
the effect size was d = .53. In this study embedded questions improved accuracy of
decision making but had little impact on how readers used the confidence scale. Moreover,
differences in effect sizes between the PI and PREP variables apparently had little to do
with the confidence component of the PREP measure.
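
A standardized mean difference for the categorical PREP decisions can be approximated from
the group means in Table 4-10 and the error term in Table 4-11. The sketch below assumes
that the standardizer is the square root of the ANOVA error mean square and that the four
cells are equal in size; the text does not state which standardizer was actually used, so
the result should be read as an approximation of the reported value rather than a
reproduction of it.

```python
import math

def standardized_main_effect(level_a_means, level_b_means, ss_error, df_error):
    """Difference between two marginal means divided by the pooled within-cell SD.

    Assumes equal cell sizes, so each marginal mean is the simple average of
    its cell means, and uses sqrt(MS_error) as the standardizer.
    """
    mean_a = sum(level_a_means) / len(level_a_means)
    mean_b = sum(level_b_means) / len(level_b_means)
    pooled_sd = math.sqrt(ss_error / df_error)
    return (mean_a - mean_b) / pooled_sd

# EQ cells (EQLB, EQNLB) versus NEQ cells (NEQLB, NEQNLB), from Table 4-10;
# error sum of squares and degrees of freedom from Table 4-11.
d = standardized_main_effect([3.14, 3.24], [2.74, 2.81], 103.36, 164)
print(f"d = {d:.2f}")
```

Under these assumptions the computation gives a value of roughly .52, close to the
reported d = .53.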
Postdiction  Calibration 

The hypothesized improvement in postdiction calibration for the embedded-question
groups was not statistically significant; however, the mean PJC scores were significantly
better in the LB conditions than in the NLB conditions. As outlined in the previous chapter,
PJC is like the PREP variable in that both are a composite of calibration and confidence
decisions. For each of the 19 items on the criterion examination, readers were asked if they
had answered the question correctly (yes/no). These yes/no decisions constitute the
categorical portion of the PJC score, and when they are compared with actual test
performance, readers can score as many as 19 "hits": predicting their answer was correct
when it was correct, or predicting their answer was incorrect when it was incorrect.
Descriptive statistics for hits on the criterion examination are presented in Table 4-12 and
summary ANOVA statistics are presented in Table 4-13.
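
The definition of a hit given above can be expressed directly: a hit occurs whenever the
reader's yes/no judgment about an item agrees with whether the item was in fact answered
correctly. The following is a minimal sketch of that tally; the function name and the
five-item example are hypothetical and are not data from the study.

```python
def count_hits(judged_correct, actually_correct):
    """Count items where the reader's yes/no judgment matches the actual outcome.

    judged_correct: list of booleans, True if the reader believed the answer was right.
    actually_correct: list of booleans, True if the answer really was right.
    """
    if len(judged_correct) != len(actually_correct):
        raise ValueError("Both lists must cover the same items.")
    return sum(j == a for j, a in zip(judged_correct, actually_correct))

# Hypothetical five-item example (the criterion examination had 19 items).
judgments = [True, True, False, True, False]
outcomes = [True, False, False, True, True]
print(count_hits(judgments, outcomes))
```

In this example the reader is credited with three hits: two correct answers judged correct
and one incorrect answer judged incorrect.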
Table 4-12.

Descriptive Statistics of Number of Hits by Experimental Condition

Experimental Group    N     M        SD
EQLB                  42    12.05    3.11
EQNLB                 42    11.10    2.90
NEQLB                 42    11.57    3.02
NEQNLB                42    11.00    2.53



Table 4-13.

Two-way ANOVA Statistics for Hits by Experimental Group

Source    df     SS         MS       F
EQ(A)     1      3.43       3.43     .41
LB(B)     1      24.38      24.38    2.91
AB        1      1.52       1.52     .18
Error     164    1375.81

By comparing Table 4-13 with Table 4-6 it is possible to determine that the LB
main effects reported as part of hypothesis 5 disappear when the calibration portion of the
PJC measure is analyzed alone. One explanation may be that being able to look back
provided readers with some, as yet unspecified, form of feedback about their
understanding that made their confidence scores on the criterion examination more accurate.
Another explanation is that readers in the LB conditions were more inclined to assign
higher confidence scores to all of their test answers. By inspecting Tables 4-14 and 4-15 it
is possible to determine that this is what happened. The LB main effect was almost
statistically significant when considering the categorical decisions alone (hits). A greater
number of hits in the LB conditions, combined with overall higher confidence scores,
resulted in the statistically significant main effect discovered in hypothesis 6. It is an error,
however, to assume that readers in the LB conditions had an objective reason for being
more confident. Frequency of exposure, in the present case looking back, is tallied at an
unconscious level and has been found to increase decision confidence without regard to the
information value of the feedback that is received (MacKenzie, 1989).
Table 4-14.

Descriptive Statistics of Average Confidence Scores on the Criterion Examination

Experimental Group    N     M       SD
EQLB                  42    4.09    1.02
EQNLB                 42    3.79    .82
NEQLB                 42    3.88    .97
NEQNLB                42    3.61    .77

Table 4-15.

Two-way ANOVA Statistics for Average Confidence Scores on the Criterion Examination

Source    df     SS         MS      F
EQ(A)     1      1.59       1.59    1.96
LB(B)     1      3.49       3.49    4.31*
AB        1      .01        .01     .01
Error     164    1375.81

*p < .05.

Strategic  Processing  Variables 

Data were collected on several other variables that were not part of the formal research
hypotheses: the amount of time subjects took to complete the Xenograde examination, the
amount of time they spent reading the text, and the number of times readers turned the
pages backwards in the two lookback conditions.

On average, subjects spent 19.13 minutes completing the 19 items in the Xenograde
examination, with a standard deviation of 4.9 minutes. Descriptive statistics are presented
in Table 4-10 and summary ANOVA statistics are presented in Table 4-11. No interaction
or main effects were present.

Table 4-10.

Descriptive Statistics of Minutes on Criterion Examination by Experimental Condition

Experimental Group    N     M        SD
EQLB                  42    19.17    4.99
EQNLB                 42    18.44    4.91
NEQLB                 42    19.31    4.71
NEQNLB                42    19.58    5.01

Table 4-11.

Two-way ANOVA Statistics for Minutes on Criterion Examination by Experimental Group

Source    df     SS         MS       F
EQ(A)     1      17.49      17.49    .73
LB(B)     1      2.19       2.19     .09
AB        1      10.45      10.45    .43
Error     164    3944.88

In comparison, subjects spent an average of 15.22 minutes reading and studying
the Xenograde text. Spending less time preparing for an examination than actually taking it
is probably indicative of both the amount of interest subjects had in learning the
experimental materials and overall poor calibration of test readiness. Descriptive statistics
for reading times are presented in Table 4-12 and summary ANOVA statistics are presented
in Table 4-13. Although there was some difference between embedded question conditions
as a function of the lookback option, this two-way interaction was not significant. Reading
times were significantly different in both the EQ and LB tests of main effects.

Table 4-12.

Descriptive Statistics of Minutes Reading the Text by Experimental Groups

Experimental Group    N     M        SD
EQLB                  42    18.26    3.46
EQNLB                 42    16.24    3.62
NEQLB                 42    13.41    4.16
NEQNLB                42    13.02    2.73

Table 4-13.

Two-way ANOVA Statistics for Minutes Reading by Experimental Group

Source    df     SS         MS        F
EQ(A)     1      684.05     684.05    54.97*
LB(B)     1      60.72      60.72     4.88*
AB        1      28.34      28.34     2.28
Error     164    2040.83

*p < .05.

The two EQ groups were engaged in reading the Xenograde material an average of
4.04 minutes, or 31%, longer than the two NEQ groups. However, the embedded
questions added 280 words to the 1,700-word Xenograde text, a 16% increase in the
amount of text to be read. Because the EQ groups spent 31% more time reading 16% more
text, embedded questions probably altered reader perceptions regarding the minimum
amount of cognitive effort needed to comprehend the Xenograde material.
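
The arithmetic behind this comparison can be reproduced from the cell means for reading
time and the word counts given above. The sketch below is illustrative only; the
words-per-minute figures at the end are an added assumption (they treat all reading time
as spent on a single pass through the text) and are not a statistic reported in the study.

```python
# Mean minutes spent reading, by group, from the reading-time descriptive table.
reading_minutes = {"EQLB": 18.26, "EQNLB": 16.24, "NEQLB": 13.41, "NEQNLB": 13.02}

BASE_WORDS = 1700      # length of the Xenograde text
QUESTION_WORDS = 280   # words added by the embedded questions

# Equal cell sizes, so marginal means are simple averages of the cell means.
eq_mean = (reading_minutes["EQLB"] + reading_minutes["EQNLB"]) / 2
neq_mean = (reading_minutes["NEQLB"] + reading_minutes["NEQNLB"]) / 2

extra_minutes = eq_mean - neq_mean
pct_more_time = 100 * extra_minutes / neq_mean
pct_more_text = 100 * QUESTION_WORDS / BASE_WORDS

# Illustrative reading rates (assumption: one uninterrupted pass through the text).
eq_rate = (BASE_WORDS + QUESTION_WORDS) / eq_mean
neq_rate = BASE_WORDS / neq_mean

print(f"Extra reading time: {extra_minutes:.3f} min ({pct_more_time:.1f}% more)")
print(f"Extra text: {pct_more_text:.1f}% more words")
print(f"Words per minute: EQ {eq_rate:.0f}, NEQ {neq_rate:.0f}")
```

Because the increase in time is about twice the increase in text, the implied reading rate
is slower in the EQ groups, which is consistent with the interpretation that embedded
questions prompted greater effort per word rather than simply lengthening the text.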
Readers permitted to look back averaged 1.21 more minutes reading than the NLB
groups, and, as reported in Table 4-13, this difference was statistically significant. When
pairwise tests were conducted comparing the EQLB group with the EQNLB group
(t(82) = 2.62, p < .05) and the NEQLB group with the NEQNLB group (t(82) = .50,
p > .05), it was possible to discern that the source of this main effect lay with the first
comparison. That is to say, having the option to look back changed the amount of time on
task only when embedded questions were present. The two-tailed critical value to be
exceeded in both of these comparisons was approximately t = 1.99. Apparently,
embedded questions changed reader perceptions about the minimum amount of cognitive
effort that was needed to understand the experimental material, and, when given the
opportunity to remediate their understanding through the lookback option, readers did so.
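
The two pairwise comparisons can be approximated from the group means and standard
deviations for reading time. The sketch below assumes an independent-samples t test with a
pooled variance, which is one conventional choice; the text does not state which form of
the test was used, so minor differences from the reported values may reflect rounding or a
different error term.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic with pooled variance, from summary statistics.

    Returns the t value and its degrees of freedom (n1 + n2 - 2).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df

# Reading-time means and SDs (minutes), 42 subjects per group.
t_eq, df_eq = pooled_t(18.26, 3.46, 42, 16.24, 3.62, 42)     # EQLB vs. EQNLB
t_neq, df_neq = pooled_t(13.41, 4.16, 42, 13.02, 2.73, 42)   # NEQLB vs. NEQNLB

CRITICAL_T = 1.99  # approximate two-tailed critical value cited in the text
print(f"EQLB vs. EQNLB:   t({df_eq}) = {t_eq:.2f}, exceeds critical: {abs(t_eq) > CRITICAL_T}")
print(f"NEQLB vs. NEQNLB: t({df_neq}) = {t_neq:.2f}, exceeds critical: {abs(t_neq) > CRITICAL_T}")
```

Under these assumptions only the comparison within the embedded-question groups exceeds
the critical value, in agreement with the pattern described above.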

It is interesting to note that embedded questions did not significantly increase the
number of times that readers took advantage of the lookback option (t(82) = -.82, p > .05).
Descriptive statistics are presented in Table 4-13.
Table  4-13. 
Descriptive  Statistics  of  Number  of  Lookbacks  by  Experimental  Groups 


Subjects were asked to read the same text material in the pilot study as they did in
the present experiment. However, in the pilot study subjects were given all of the time they
felt necessary to read and study the text material, whereas in the present study reading time
was limited to 21 minutes. One of the effects of this design change was to suppress the
number of times that readers took advantage of the lookback option to remediate their
understanding. The differences in standard deviations between the pilot and present study
support the conclusion that not having to work under time constraints produced greater
variability in subject behavior. Descriptive statistics of lookbacks for the pilot study are
presented in Table 4-14. The difference between these means is statistically significant
(t(98) = 2.14, p < .05).
Table  4-14. 
Descriptive  Statistics  of  Number  of  Lookbacks  in  Pilot  Experiment 


CHAPTER  5 
SUMMARY  AND  CONCLUSIONS 


The practical and theoretical rationale for this study will be summarized in this
chapter. After a discussion of the research design and the implications drawn from the
findings of the experiment, recommendations for future research will be suggested.

Summary 
The main purpose of this study was to determine whether questions embedded in
expository text could improve the correspondence between adult readers' subjective
assessments of test readiness and objective test performance (prediction calibration).
A secondary consideration was whether embedded questions had an impact on postdiction
judgments of performance (postdiction calibration). In similar research, the challenge has
been to externalize these complex mental events so they can be studied. Many researchers
have attempted to understand these mechanisms by trying to make a connection between
what readers can declare about the workings of their memory and subsequent memory
performance. Results using this approach have been disappointing. The central theoretical
argument forwarded in this study was that this tactic was inappropriate for studying
perceptions of cognitive readiness; readers may not be able to give accurate accounts of
their memory processes because many of the cognitive events associated with memory
monitoring reside with the executive processes. Mental events at the executive level are not
always available to conscious awareness and therefore not easily reportable. For these
reasons, readers were asked to make comparative decisions about their perceptions of test
readiness (PI & PREP). This tactic captures the internal workings of both metacognitive
knowledge and executive processes while avoiding the problems associated with other
approaches: decisions regarding cognitive readiness reflect, in part, a reader's declarative
knowledge concerning readiness, yet the procedure does not depend upon the reader's
ability to report on this knowledge, and it still captures the outcomes of executive decision
making.

Research using embedded questions for the exclusive purpose of testing their effect
on calibration of test readiness has been undertaken in only one other study (Pressley et al.,
1987). Pressley et al. found that embedded questions improved prediction calibration, and
by asking readers to make a series of decisions about the need to reread, they created a
calibration measure that was based directly on the perceived need for strategic remedial
behavior (PREP). The present study used the same PREP measure but altered many of the
experimental conditions to determine if these findings would hold under conditions more
similar to the academic reading situations normally encountered by adult learners.

In order to minimize the confounding effects of prior knowledge, subjects were
asked to read a text based on a make-believe solar system. This material was relatively
difficult, and concepts in the text were cumulative in nature, as would be found in a
college-level text. The embedded questions used were high-order questions; they required
readers to do more than simply retrieve factual information. The effects of embedded
questions were tested in both a lookback and a no-lookback condition. The no-lookback
condition made it possible to isolate the effects of embedded questions on PREP alone from
the effects of embedded questions plus any possible restudy decisions. It may be that
embedded questions change perceptions of readiness but are not sufficient to elicit
accompanying changes of behavior. The lookback condition makes it more appropriate to
generalize to normal academic reading situations. In addition, this study included two
measures of intellectual abilities to determine if the effects under investigation were
vulnerable to individual differences.
Conclusions

The central question of this study was whether embedded questions could bring
subjective judgments of test readiness into closer correspondence with an objective
assessment of test performance (prediction calibration). As hypothesized, embedded
questions improved prediction calibration. These effects were discovered
with two different, but conceptually related, measures (PI and PREP). With the PI
variable, readers who encountered embedded questions gave more accurate evaluations of
how well they would do on the criterion test. With the PREP variable, readers who
encountered embedded questions had more accurate perceptions regarding how much
strategic behavior (rereading) would be necessary to reach four different criterion levels of
performance. The source of the EQ main effect on the PREP variable was to be found in
the categorical decisions and not in the confidence portion of the measure.

Although postdiction judgments of calibration were more accurate in the groups
who encountered embedded questions, this difference was not statistically significant. The
effects of embedded questions on postdiction judgments were likely overshadowed by the
direct feedback readers received once they were engaged in answering the actual questions.
The lookback factor was informative, however, with regard to postdiction judgments.
The main effects found for the LB conditions on postdiction judgments were the result of
both more accurate categorical decisions (yes/no, the answer was correct) and overall higher
confidence scores.

Although readers in the NEQLB group used the lookback option, they did not
spend any more time on task than the NEQNLB group. The EQ groups spent 31% more
time reading 16% more text when compared with the NEQ groups. Clearly, embedded
questions helped readers realize that greater effort was required. Moreover, readers who
both encountered embedded questions and were given the option to remediate their
understanding through the lookback option did look back and ultimately spent more time
engaged in studying the experimental materials. Given that research findings consistently
show that readers generally overestimate their preparedness, these findings are
encouraging.

The effects of embedded questions must be considered against the larger backdrop of the
experiment as a whole. In the present study subjects took less time to read the text than
they did to take the examination. This discovery was taken as evidence of both the amount
of interest subjects had in learning the materials and their overall poor calibration of test
readiness. The short duration of the experiment, the moderate reliability of the criterion
test, and the overall lack of motivation on the part of subjects certainly had a mediating
effect on all findings in this study. These forces, combined with a small sample and
homogeneous subject pool, also worked against finding the hypothesized
aptitude-treatment interactions.

Generalizing from this study to nonexperimental conditions seems warranted for
two reasons. First, subjects in this study were not cued to the presence of embedded
questions, nor were they asked to answer them. Because reading for remembering is a
learner-controlled process, and because the effects of embedded questions ultimately
depend on what the reader does with them, this study sought to determine if embedded
questions could induce a spontaneous, learner-produced effect rather than an
investigator-induced one. This design feature makes the present study distinctive and also
allows for the greatest degree of generalizability to nonexperimental settings. Second, the
facilitating effects of embedded questions were found in an experiment of very short
duration, with subjects whose only motivation was probably to get through the experiment.
It is reasonable to speculate that the effects of embedded questions may be more robust
under conditions more similar to the types of academic reading, testing, and incentive
situations experienced by adults.
Implications

The face of education was changed forever with the advent of movable type
printing. As the first form of mass communication, books put more knowledge in the
hands of more people faster and more cheaply than ever before. Today, teachers of
students of all ages use assigned readings as one of the key components in their arsenal of
instructional techniques. These readings are assigned in the hope that information will be
communicated, contexts understood, and preformed assumptions about the world
confronted and challenged. The promise of reading has always been with us, and reading
has significant information-processing advantages over the spoken word in that readers can
elect to review any part of a text previously read but no longer remembered.
Unfortunately, however, the full promise of reading is seldom realized. The evidence is
compelling that many readers get through academic courses without acquiring a clear
understanding of the most fundamental aspects of the material the text is intended to
communicate. The most serious problem is not so much readers' inherent inability to read,
but rather their interaction with the text. Embedded questions have a rich history of
assisting learners in acquiring concepts and principles from prose passages. What the
present study suggests is that embedded questions can also be used to change the
dysfunctional interaction many readers have with the text.

In 1929, Alfred North Whitehead declared that the central problem of all education
was the keeping of knowledge alive, of preventing it from becoming "inert." The problem
of acquiring inert knowledge and the passive relationship many readers have with text
seem to be at the very core of the many difficulties associated with academic reading and
learning. The characterization of readers as active information processors stylistically
disposed to "take charge" of their understanding and remembering is probably in error
(Bransford, 1986). Many students are rarely challenged or engaged by what they read, and
even less often do they challenge the text or their own understanding of what they have
read. In effect, they are passive agents with "subdued attitudes of reverence to text"
(Tierney, 1982, p. 100). This relationship of reader to text is contrary to the goals of
educators who are in the business of helping students to reflect on and better understand the
nature of their own understanding. Well-thought-out embedded questions have the
potential to challenge readers' understanding of what they are reading while they are
engaged in the process of reading. This is in marked contrast to current practices, where the
readers' first test of their understanding is at the time of formal testing. At this point, of
course, the proverbial horse is out of the barn and the damage is done. Readers of all ages
have a repertoire of strategies they can employ to remediate their understanding. However,
they have no reason to use them if they do not understand that they do not understand.

Undetected cognitive failure during reading is a problem well documented with
young readers, and researchers have recently established that even adult, skilled readers are
often not proficient at monitoring their cognition. Developing a better understanding of this
problem is a matter of some urgency, since reading expository prose is one of the primary
means through which students are expected to acquire knowledge in academic settings, and
perceptions of comprehension and cognitive readiness are the principal determinants of the
learning strategies readers employ and the amount of cognitive resources they use.
Research findings from this study support the conclusion that embedded questions have the
effect of bringing subjective beliefs regarding test readiness into better calibration with
objective test preparedness.

Being well calibrated has powerful advantages for any one reading and studying
session; however, meaningful learning is cumulative in nature, and the ability to learn new
material is highly dependent upon prior knowledge. For this reason, small differences in
cognitive monitoring ability may account for large differences in academic performance if
considered over the course of several or more school years. This point is worthy of an
extended example. In a series of experiments, Owings and Peterson (1980) asked two
groups of fifth-graders (those in the upper quarter of the class and those in the bottom
quarter of the class) to read two different sets of short stories, one precise and one less
precise. The stories were simplified and all students knew the individual words of the
stories. The stories included such statements as "The hungry boy ate the hamburger"
(precise) and "The sleepy boy ate the hamburger" (less precise). The findings of this study
were reported in Bransford (1979).


Each student read a passage and studied it as long as he or she wished, and
then read and studied another passage. (Each pair of passages contained one
precise and one less precise story.) After each series of two passages, the
students were asked to state whether one seemed harder to learn than the
others and why. They then received a test on each passage; "What did the
hungry boy do?" would be a typical test question. . . The upper-quartile
students could readily distinguish precise from less precise passages.
Furthermore, they studied longer for the harder, less precise passages than
for the precise passages. In contrast, the lower-quartile students were quite
poor at distinguishing between the two types of passages and, although they
seemed highly motivated, exhibited absolutely no tendency to spend longer
studying the harder, less precise passage. (p. 200)

Regardless of the antecedent causes that give rise to poor cognitive monitoring,
whether they are some deficit in general intelligence or reading skills, the consequences are
the same: those students in the greatest need of what can be gained by studying will be the
least likely to engage themselves when the academic situation demands it most. After a
short duration, the differences between these two groups of students will expand; those
who have poor cognitive monitoring skills will also have an impoverished base of prior
knowledge upon which to learn new material. In effect, the rich will get richer and the
poor will get poorer. In the present study, embedded questions positively altered reader
perceptions of cognitive readiness in a reading setting that lasted less than half an hour.
When considering their effects over a longer duration, embedded questions could also
serve to mitigate the cumulative harm that results from poor cognitive monitoring.

If calibration plays an important and positive role in the process of reading and
understanding, it is logically imperative that classroom teachers teach for calibration.
Readers must be taught to self-question and self-cue to bring forth information relevant to
metacognitive control. Not only must administrators select texts with embedded questions
when appropriate, but teachers must endeavor to assist students in using them and help
them to employ the cognitive resources necessary to improve calibration and increase their
understanding. Embedded questions can be thought of as a prompt to do what those who
monitor their cognition well do without prompting, that is, to ask themselves questions as a
test of their understanding of the text. If used often enough, and under conditions where
there is a legitimate connection between successfully answering the embedded questions
and doing well on the criterion test, it is reasonable to believe that embedded questions
could be used to help students internalize the powerful cognitive skill of self-questioning
and take more control of their own learning.

In our country today we subject students to a myriad of standardized, objective
examinations, results from which play an important role in their future education and
subsequent careers. Test makers may want to consider the confounding effect of poor
cognitive monitoring on tests that purport to measure scholastic achievement or intellectual
ability  or  that  claim  a  high  correlation  to  intelligence  measures.  The  presence  of  embedded 
questions  in  passages  of  text  present  in  these  high-stakes  tests  could  trigger  the  cognitive 
resources  that  lead  to  strategic  behaviors  necessary  to  reduce  this  confounding.  Although 
not  the  focus  of  the  present  study,  it  is  important  to  point  out  that  calibration  failure,  in  a 
formal  experiment  or  normal  classroom  setting,  can  also  result  from  poor  test  construction. 

In  addition  to  being  a  remedy  for  poor  calibration,  embedded  questions  can  also 
serve  as  a  research  prototype  to  externalize  and  study  the  complex  mental  events  associated 
with  comprehension  and  memory  monitoring.  What  is  needed  are  techniques  that  can 
assist  the  process  of  understanding  why  some  readers  are  better  able  to  monitor  their 
cognition than others. Furnished with an understanding of these individual differences, it
may  then  be  possible  to  design  instruction  to  remediate  poor  cognitive  monitoring  directly. 
The process of asking why embedded questions of one type or kind improve cognitive
monitoring in some learners and not in others may give experimenters one more foothold
for understanding the architecture of human cognition. From such an investigation, a
taxonomy of questions, and questioning processes, might be developed that is more firmly
grounded on empirical research than is currently available.


Recommendations 

In  the  present  study  embedded  questions  positively  altered  perceptions  of  cognitive 
readiness  and  had  the  effect  of  making  readers  better  calibrated.  The  logical  next  step  is  to 
determine how these effects are produced. Embedded questions may provide feedback that
readers  can  use  to  make  adjustments  in  their  judgments  of  cognitive  readiness.  They  may 
also  act  as  a  prosthetic  device,  triggering  readers  to  evaluate  their  comprehension  where 
they  may  not  have  done  so  on  their  own.  However,  the  important  point  is  that  success  in 
isolating  any  of  these  cognitive  processes  will  depend  on  more  powerful  research  designs, 
more  sensitive  measures,  and  data  collection  taken  from  real-world  academic  settings. 
These are obtainable goals. The software shell program that administered and collected
data in the present study is completely capable of collecting similar data from a
semester-length collection of readings. Semester-length research would allow for both more stable
measurements and more powerful within-group designs.

With  an  extended  course  of  learning  under  investigation,  embedded  questions  could 
be  repeated  in  the  criterion  examination.  This  design  feature  allows  for  several  interesting 
questions to be tested. For example, do readers who encounter embedded questions
actually answer them? What is the relationship between being able to answer the embedded
questions and performance on the test as a whole? Are there particular types of
embedded questions that seem to help particular types of learners but not others? Having
the flexibility to alter the types and amounts of embedded questions and then repeat them in
the criterion examination provides a means to partition a large number of potentially
meaningful sources of variance.




After watching many subjects interact with the experimental material in the present
experiment, this experimenter believes that a measure of cognitive tempo may be an
individual difference that might be informative in future research. Some readers may be
poorly calibrated because they lack deliberateness in testing their understanding (Kagan &
Kogan, 1970), whereas others might be more accurately described as defensive and
anxious, and choose to escape the stressful act of evaluating their understanding by making
quick decisions about their state of cognitive readiness (Wapner & Conner, 1986). With an
extended course of learning under investigation, numerous measures of individual
differences could be tested and the relationship between aptitudes and performance at
different stages of learning could be examined. Ultimately, however, what is needed are
techniques and measures of on-line cognitive processing, that is, cognitive measures taken at the
moment of learning.

Having  a  proxy  measure  of  the  amount  of  cognitive  effort  a  reader  was  expending 
would  be  particularly  useful.  Again,  this  is  possible  in  an  investigation  of  extended 
learning  with  the  software  shell  program  designed  for  the  present  study.  McCormick  and 
Pressley  (1989)  made  the  convincing  argument  that  attention  is  best  thought  of  as  having 
volume.  One  reader  may  take  twice  as  long  as  another  reader  to  study  a  portion  of  text; 
however, the mental energy expended by the second reader may be twice that of the first.
Combining the dimensions of intensity and duration to form a single volume-of-attention
measure would be useful for understanding strategy use during reading. For example,
with this measure it would be possible to determine if either learning efficiency or individual
differences in cognitive monitoring are related to the selective attention strategy, that is, being
better able to selectively attend to and encode the more important information in the text
while decreasing attention to unimportant information.
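
One way to make the volume-of-attention idea concrete is sketched below in Python. The sketch assumes, purely for illustration, that volume is the sum over text segments of duration multiplied by intensity. Neither this multiplicative rule nor the names `Segment` and `attention_volume` are taken from McCormick and Pressley (1989) or from the shell program used in this study.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Segment:
    """One stretch of study time on a portion of text (hypothetical record)."""
    duration_s: float   # time on task, in seconds
    intensity: float    # effort proxy on an arbitrary positive scale


def attention_volume(segments: Iterable[Segment]) -> float:
    """Sum duration x intensity over segments (assumed combination rule)."""
    return sum(s.duration_s * s.intensity for s in segments)


# Reader A studies twice as long; reader B works at twice the intensity.
reader_a = [Segment(duration_s=120.0, intensity=1.0)]
reader_b = [Segment(duration_s=60.0, intensity=2.0)]
```

Under this assumed rule the two readers in the example above expend the same volume of attention, even though their study times differ by a factor of two.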

The software shell program developed for the present study is able to measure both
dimensions of intensity and duration. The program is capable of measuring and keeping
track of reader movement through a text while simultaneously keeping track of time on task
to within the order of one-sixtieth of a second. A proxy measure of cognitive effort may
be formulated by using the secondary task paradigm and measuring latency scores. Before
readers are engaged in reading the experimental text, they may be put through a set of
orienting instructions where they are taught to depress a key (spacebar, mouse) whenever
they hear a particular tone. Data generated from these initial sessions could be used to form
base-rate comparison data on subject reaction times. Once readers are formally engaged in
reading and studying the text these tones could be used again, and the time between the
initiation of the tone and the subject's response could be used as a proxy measure of
cognitive capacity engagement: the longer it takes subjects to respond to the secondary task,
the greater the amount of effort they invested in the primary task. The secondary task
paradigm has been demonstrated as both a valid and reliable proxy measure of cognitive
effort (Greeno, 1980). A measure of cognitive effort would also be useful for investigating
expert/novice differences in calibration; the amount of effort required to process text
decreases as expertise increases and both are likely to have an impact on perceptions of
cognitive readiness.
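
The latency comparison described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the scoring used by the shell program: the function name `effort_proxy`, the choice of a simple difference of means, and the latency values are all hypothetical.

```python
from statistics import mean
from typing import Sequence


def effort_proxy(baseline_rts: Sequence[float],
                 reading_rts: Sequence[float]) -> float:
    """Mean probe latency during reading minus mean baseline latency (seconds)."""
    if not baseline_rts or not reading_rts:
        raise ValueError("both latency lists must be non-empty")
    return mean(reading_rts) - mean(baseline_rts)


baseline = [0.30, 0.32, 0.28]   # made-up orientation-session latencies
reading = [0.45, 0.50]          # made-up latencies while studying the text
slowdown = effort_proxy(baseline, reading)
```

A larger positive `slowdown` would be read as greater engagement in the primary reading task. A difference of means is only one possible summary, and other choices (medians, per-probe differences) would be equally defensible.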

It  is  the  proper  role  and  function  of  research  to  advance  the  nature  of  knowledge 
and  expand  the  boundaries  of  understanding.  It  brings  to  these  tasks  a  rigorous  pursuit  of 
truth  and  a  scrupulous  regard  for  accuracy.  It  seeks,  in  essence,  that  knowledge  and 
understanding be carefully calibrated. This study was an effort to increase this calibration
and  advance  our  understanding  of  the  cognitive  processes  so  important  to  a  richer  and  more 
profound  appreciation  of  the  promises  of  the  human  mind. 


APPENDIX  A 
ORIENTATION  INSTRUCTIONS 

The following instructions were read to participants by the proctor before they began
studying the Xenograde Science Systems:

You  are  about  to  read  and  study  a  short  textbook  that  describes  an  imaginary 
solar  system  called  Xenograde  Science.  The  reading  is  composed  of  less 
than  35  pages  of  material.  Your  objective  is  to  read  and  study  the  material 
to  the  very  best  of  your  ability.  Unlike  normal  textbook  reading,  the 
amount  of  time  that  you  can  spend  on  this  text  is  limited  to  23  minutes.  The 
amount  of  time  that  you  have  remaining  will  be  presented  in  the  upper  left 
hand  portion  of  the  computer  screen.  When  you  have  finished  the  study 
portion  of  this  experiment  you  will  be  asked  to  respond  to  a  short 
questionnaire  that  will  ask  you  some  questions  about  yourself  as  a  learner. 
Once  you  have  finished  the  questionnaire  the  computer  will  route  you  to  the 
19-item test. In this test you will be asked to apply the principles you have
learned about the Xenograde solar system. If you finish reading and
studying the Xenograde text before the 23 minutes has expired you are free
to go on to the questionnaire portion of the study. Simply go to the end of
the text and then turn the page one additional time. However, be reminded
that your objective is to do the very best you can on the Xenograde
examination. Any questions? You may begin.

At this point in the experiment subjects were instructed to read the introductory
statement on the computer terminal (see Appendix B).

APPENDIX B
Xenograde Science Text


The  first  eight  (8)  cards  are  encountered  by  all  participants.  The  remaining  cards 
represent the text as seen by subjects in an embedded question experimental condition.

Welcome to the Xenograde experiment. Your participation is
greatly  appreciated.  This  study  will  address  questions  of  human 
learning.  With  your  assistance,  this  study  may  contribute  to  our 
collective  understanding.       Thank  you. 

ORIENTATION 


The  computer  screen  that  you  are  now  looking  at  has  been 
designed  to  look  and  act  as  much  as  possible  like  a  text  book. 
Movement  forward   and  backward,   or,   turning   pages   forward  and 
backward  is  done  with  the  two  arrows  on  the  keyboard.       The 
pointing  hands  at  the  bottom  right  corner  of  the  page  will  let  you 
know  what  directions  are  possible.     To  go  forward  press  the 
forward  arrow   (return  key).     If  a  hand  is  pointing  backwards 
you  will  be  able  to  press  the  back  arrow  to  move  to  earlier  text. 
If you are ready, GO FORWARD. . .




About this study and the "Science of Xenograde Systems"


In  the  pages  to  follow  you  will  learn  about  a  fictitious  science 
system.     No  one  could  have  studied  such  a  subject  so  all 
participants  start  out  on  equal  ground.     Your  task  is  to  learn  the 
material to the best of your ability. The experiment consists of a
series  of  tasks.     After  reading  and  studying  the  Xenograde 
Science  text  you  will  be  asked  several  questions  about  yourself 
as  a  learner.     You  will  then  take  an  examination  on  this  material. 

This  experiment  takes  approximately  50  minutes  and  the 
time  you  can  spend  reading  the  Xenograde  text  is  timed.     The 
amount  of  time  you  have  remaining  will  be  indicated  at  the  top 
right  hand  portion  of  the  page  (where  the  pages  are  bent  down). 
Once  you  have  started  the  task  it  is  important  to  keep  your 
concentration  focused.      The  validity   of  the  experiment  depends 
on  your  earnest  concentration. 

Things you should know as a participant in this study


Daniel  Lofald  is  the  principal  investigator  in  this  study  and  the 
only  person  who  will  know  your  name  as  a  participant.  Daniel  is  a 
doctorate   student  with   the  Foundations   of  Education   Department. 
He  will  be  glad  to  answer  any  questions  you  may  have  and  can  be 
contacted  through  the  Foundations  office  at  1403  NRN. 

Initially,  your  name  will  be  linked  to  the  random  SUBJECTID# 
you  have  been  issued.     Your  name  will  be  permanently  erased 
from  all  files  as  soon  as  it  has  been  determined  that  the  correct 
subject  number  has  been  registered  in  each  of  the  sections  of  this 
experiment.     The  only  reason  your  name  would  be  needed  is  if 
something unforeseen happened in the computer program that
resulted  in  a  loss  of  data.     At  all  times  your  anonymity  as  a 
participant  will  be  protected. 


Regrettably, no financial compensation can be offered in
exchange  for  your  participation  in  this  study.     Participation  is 
strictly  voluntary.     Please  be  informed  that  you  are  free  to 
discontinue  participation  in  this  project  at  any  time  without 
prejudice. 

Proceed  with  this  study  only  if  you  feel  you  understand  what 
is  expected  of  you  and  are  comfortable  with  your  role  as  a 
participant. 

Feel  free  to  go  back  and  review  any  of  the  pages  of  these 
instructions  or  call  Dan  over  (do  not  ask  the  question  so  other 
subjects  can  hear  you)  and  ask  him  any  questions  at  this  time. 


If  you  are  ready  to  proceed,  press  the  forward  arrow. 

Do you have your small card with your SUBJECTID#?
If  not,  the  computer  will  not  allow  you  to  proceed.     If  you  have 
not  already  done  so  make  that  card  handy. 

Got  it?    Great! 

Before we go on, practice going backwards in this computerized
textbook.  Use  the  arrows  on  the  keyboard  and  turn  back  a  page 
or  two  and  then  come  back  to  this  page.     Go  ahead,  give  it  a  try! 

If  you  have  made  it  back  to  this  card  I  guess  that  we  can  assume 
that  you  know  how  to  turn  the  pages  in  this  book. 

Turn  a  page  forward  whenever  you   are  ready. 


In  this  experiment  you  will  use  the  arrows  on  the  keyboard 
to  turn  the  pages  and  the  mouse  is  used  to  enter  data  (don't 
panic).     In  the  columns  below  you  will  enter  your  SUBJECT  ID#. 
You  will  notice  that  they  look  much  like  the  bubble  sheets  for 
registration.     Same  principle,  bring  the  mouse  indicator  (small 
arrow)  over  the  appropriate  bubble  and  click  the  mouse. 

[Figure: SUBJECT ID# entry grid, columns of numbered bubbles, one column per digit]

If you make a mistake just re-click in the appropriate number.

When this number matches your number press the forward key.

Lesson 1: Overview

Xenograde Systems are very small imaginary systems similar in
structure to an atom or the solar system. As in the atom, the
center body of the Xenograde System is called the NUCLEUS. In
each Xenograde System there are one or more bodies revolving
around the nucleus. Each of these bodies is called a
SATELLITE.


[Diagram labels: nucleus, satellite]

The nucleus of a Xenograde System contains tiny particles
called ALPHONS. These alphons may be in the very center of the
nucleus; that is in the INNER REGION, or they may be against the
shell of the nucleus; that is at the OUTER SHELL.

[Diagram labels: outer shell, alphons, inner region]

(2)


While  the  alphons  are  breathing  inside  the  nucleus,  the 
satellites  are   moving  around  the  nucleus   in  circular  paths   called 
orbits. However, when the Xenograde System is placed in a
magnetic  field  the  satellites  no  longer  travel  in  circular  orbits 
but  travel  in  irregular  orbits.     That  is,  when  first  placed  in  a 
magnetic  field  the  satellites  begin  to  move  toward  the  nucleus 
until  they  collide  with  the  nucleus  and  then  they  move  out  until 
they  reach  their  original  orbit  and  then  back  to   the  nucleus  and 
so  on. 


[Diagram labels: Nucleus, Satellite, Original Orbit]


(5) 


Can  you  correctly   identify   the  elements   labeled  in  small   letters? 
For  the  system  below,  how  many  alphons  would  be  at  the  outer 
shell  at  time-6?     What  is  this  inward  and  outward  migration  of 
alphons  called?     Is  this  Xenograde  System  in  a  magnetic  field? 


[Diagram: Xenograde System at Time-0]

When a satellite collides with the nucleus as the alphons are
migrating outward it may pick up an alphon from the nucleus
as indicated in this diagram.

[Diagram labels: Nucleus; Satellite; Original Orbit; Collision, no pick-up; Collision + alphon pick-up]

Picking up an alphon from the nucleus causes the satellite
to increase velocity.

During  the  inward  migration  of  alphons,  satellites  may  drop  off 
or  leave  alphons  in  the  nucleus.     When  this  happens  the  speed  or 
velocity  of  the  satellite  decreases  and  it  travels  more  slowly. 


(6) 


Alphon  pick-up  by  the  satellites  only  occurs  during  the 
outward  migration  of  alphons  and  alphon  drop  off  only  occurs 
during  the  inward  migration  of  alphons.     Since  satellites  only 
collide with the nucleus when a Xenograde System is in the
magnetic field, alphons can only be exchanged when the system
is in a magnetic field.


[Diagram panels: Time-0, Time-1, Time-5, Time-7]


(7) 

Lesson 2:     Bre

Remember  that  alphons  migrate  one  by  one  from  the  inner 
region  of  the  nucleus  to  its  outer  shell  (exhale  phase).     After  all 
of  the  alphons  have  migrated  to  the  outer  shell  they  reverse 
direction  and  migrate  back  to  the  inner  region  (inhale  phase). 
These  two  phases  are  said  to  be  one  breathing  cycle.     The  time 
between  the  migration  of  one  alphon  and  the  migration  of  the 
next  alphon  is  constant  for  any  given  Xenograde  System.     That  is, 
the  length  of  time  between  the  migration  of  the  first  alphon  and 
the  second  alphon  is  exactly  the  same  length  of  time  as  that 
between  the  migration  of  the  second  alphon  and  of  the  third,  and 
the  fourth,  etc.     This  period  of  time  between  two  successive 
alphon migrations is an alphon second. Times for Xenograde
Systems are always given in alphon seconds.

(9) 

When we first put a system in a magnetic field before
alphons have migrated we call the time, time-0 or t-0 for short.
The total number of alphons in the nucleus of a system is called
the alphon number. This is the means by which different
Xenograde systems are distinguished from one another. When
speaking of a system we write Xenograde-8, which means this
system has 8 alphons in the nucleus under normal conditions.
For  example: 


A  Xenograde-8 


(10) 


If a nucleus contains eight particles and the number of particles
in each region of the nucleus is counted each time a particle
migrates, how many particles would be in the inner region and
how many particles would be at the outer shell on the 12th such
count? (Assume that all of the particles were in the center of the
nucleus at time-0)

Two conditions are necessary for satellites to be able to drop off
an alphon. What are they?

If a satellite picks up an alphon when it collides with the nucleus
will the next collision occur sooner or later than normal?

Is it possible to determine what breathing phase is represented
by the below diagram or how many alphon seconds have
passed?

The use of diagrams to illustrate breathing in a Xenograde
System is a rather inefficient way of illustrating data. Another
way is to present the readings of our measuring instruments in
a table. The following is a table of readings for a Xenograde-3
system.


1. t stands for time (given in alphon seconds).

2. Inner means the # of alphons in the inner region.

3. Outer means the # of alphons in the outer region.

In this system the next inhale phase would begin between
t-6 and t-7 and the next exhale phase would begin at t-10.

t     Inner   Outer
0     3       0
1     2       1
2     1       2
3     0       3
4     1       2

t     Inner   Outer
3     6       3
4     5       4
5     4       5
6     3       6
7     2       7
8     1       8
9     0       9
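
The one-by-one migration rule from Lesson 2 can be sketched in Python as a check on tables of this kind. The sketch assumes that all alphons start in the inner region at t-0, that exactly one alphon migrates per alphon second, and that the direction reverses as soon as the last alphon arrives. The function name `alphon_counts` is hypothetical and is not part of the experimental software.

```python
def alphon_counts(alphon_number: int, t: int) -> tuple[int, int]:
    """Return (inner, outer) alphon counts at time t, in alphon seconds."""
    if alphon_number <= 0 or t < 0:
        raise ValueError("alphon_number must be positive and t non-negative")
    phase = t % (2 * alphon_number)          # position within one breathing cycle
    if phase <= alphon_number:               # exhale: alphons move outward
        outer = phase
    else:                                    # inhale: alphons move back inward
        outer = 2 * alphon_number - phase
    return alphon_number - outer, outer


# Rows (t, inner, outer) for a Xenograde-3 system, t = 0..4.
rows = [(t, *alphon_counts(3, t)) for t in range(5)]
```

Under these assumptions the rows for t-0 through t-4 agree with the Xenograde-3 readings tabulated above. The sketch makes no claim about how the original lesson defined the start of later phases.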

Lesson 3: Satellites

While breathing is going on inside the nucleus three satellite
bodies are revolving around the nucleus. Under normal
conditions satellites move in circular paths called orbits. The
path radius is the distance from the edge of the nucleus to the
orbit of a satellite. This distance is measured by small units
called microns. In the diagram below satellites A, B, and C are 2,
3, and 4 microns away from the nucleus respectively.


[Diagram: satellites A, B, and C in orbit around the nucleus]


(13) 


You  will  recall  that  when  Xenograde  Systems  are  placed  in  a 
magnetic  field,  instead  of  traveling  in  circular  orbits  the 
satellites  begin  traveling  in  irregular  orbits.     When  first  placed 
in a magnetic field each satellite begins to move toward the
nucleus  until  it  collides  with  the  nucleus.     Then  the  satellite 
moves  away  from  the  nucleus  until  it  reaches  its  original 
distance  from  the  nucleus  (the  distance  of  its  original  orbit).     As 
long  as  the  system  is  in  a  magnetic  field,  the  satellite  continues 
to  bounce  back  and  forth  between  its  original  orbit  and  the 
nucleus. 


(14) 

An electronic instrument is used to record positions of the
satellites. This instrument is triggered by alphon migration and
therefore records the position of each satellite every alphon
second. The satellites' positions are indicated by flashes on a
sensitive piece of graph paper. A different symbol is used to
designate each of the satellites shown below. The horizontal
dimension represents time in alphon seconds; the vertical
dimension represents the distance of satellites from the nucleus.


[Graph: satellite positions; horizontal axis: time in alphon seconds; vertical axis: distance from the nucleus]


When first placed in a magnetic field does the satellite move first
toward the nucleus or away from the nucleus? How many
micron seconds will transpire before S( ) moves towards its
original orbit?

[Graph axis label: Time in alphon sec.]

When the system is in a magnetic field satellites move in and
out and collide with the nucleus. These collisions are recorded
on the graph paper as blips. Each blip on the record is a collision
of a satellite with the nucleus. A continuous pen line is recorded
on the same graph paper that contains the satellite positions.
Every time a collision occurs the pen line jogs or blips. This
record is therefore continuous rather than only recorded once
each alphon second like satellite positions.


For  example: 
[Graph: pen line with blips marking collisions]

(16)


The position of the blip therefore indicates the exact time when
the blip occurred. The first blip on the graph occurred at about .4
of an alphon second after t-1 or at t-1.4. Based on this graph 6
collisions occurred with the nucleus between t-2 and t-8.

When placed in a magnetic field satellites not only change
their  orbits  from  circular  to  spiral  but  they  also  have  a  charge. 
This  charge  is  measured  in  units  called  volts.     The  size  of  the 
charge  is  related  to  the  speed  or  velocity  with  which  the 
satellites  travel.     The  higher  the  charge  the  faster  the  satellite 
moves  in  and  out  between  the  nucleus  and  its  original  orbit.     In 
other  words  the  higher  the  charge  the  greater  number  of  trips 
the  satellite  makes  between  its  original  orbit  and  the  nucleus  in 
a  given  period  of  time.     Usually  when  reporting  the  charge  of  a 
satellite  we  use  a  signed  number.     That  is,  we  say  a  satellite  has 
a  charge  of  +3  or  -3  (positive  three  or  negative  three),  etc.     This 
sign  (the  +  or  -)  indicates  direction  of  travel.     If  a  satellite  is 
traveling  toward  the  nucleus  it  is  assigned  a  +  (positive)  sign,  if 
it  is  moving  away  from  the  nucleus  it  is  assigned  a  -  (negative) 
sign. 

(19) 

When a Xenograde System is placed in a magnetic field, all
satellites begin to move toward the nucleus, so they all have a +
(Positive) charge first. This charge changes to - (Negative) when
a  satellite  changes  direction  to  move  away  from  the  nucleus 
following  a  collision. 

The  value  or  magnitude  of  a  satellite's  charge  changes  when 
alphon  pick-up  or  drop-off  takes  place.     That  is,  when  a  satellite 
collides  with  the  nucleus  and  picks  up  an  alphon  its  charge 
increases;  when  a  satellite  collides  with  the  nucleus  and  drops 
off  or  leaves  an  alphon  its  charge  decreases. 


(21) 

The sign and the value of the charge are recorded next to each
satellite position or flash. At t-3 satellite (O) has an * instead of
a charge value. This is called a blurred flash. A blurred flash
reading will occur at the exact moment at which the satellite
collides with the nucleus or when the satellite reaches its
original orbit.

[Graph: pen line and satellite flashes labeled with signed charge values; horizontal axis: alphon seconds]

Velocity is a measure of the number of microns a satellite
travels in one alphon second. Since we know that charge is
related to the velocity a satellite travels we should be able to use
charge to determine velocity. For example, since S(O) moved 2
microns in one alphon second we would say that its velocity is 2
microns per alphon second.

[Graph: pen line and satellite flashes labeled with signed charge values; horizontal axis: time in alphon seconds]

To find the total distance traveled by a satellite it's necessary
to observe its direction of travel (sign) and to note when it
changes direction at its original orbit or at the nucleus. We
observed in the last problem that between t-3 and t-4 S(•) moved
out one micron to its original orbit and then back 2 more microns.
Hence, the satellite has a velocity of 3 microns per second.
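
The bookkeeping in this example, where distance traveled has to be counted across a change of direction, can be sketched in Python. The sketch assumes a constant speed during the alphon second (no alphon pick-up or drop-off) and uses the sign convention from the text: +1 for travel toward the nucleus and -1 for travel away from it. The function name `advance_one_second` and the 4-micron orbit in the example are hypothetical, since the text does not state the orbit of S(•).

```python
def advance_one_second(distance: int, sign: int, speed: int, radius: int):
    """Move a satellite for one alphon second between the nucleus and its orbit.

    distance: microns from the nucleus; sign: +1 inward, -1 outward;
    speed: microns per alphon second; radius: original orbit in microns.
    Returns the new (distance, sign).
    """
    if radius <= 0 or speed < 0 or sign not in (1, -1):
        raise ValueError("need radius > 0, speed >= 0, and sign of +1 or -1")
    remaining = speed
    while remaining > 0:
        # Room left before reaching the nucleus (inward) or the orbit (outward).
        room = distance if sign > 0 else radius - distance
        move = min(remaining, room)
        distance -= sign * move      # inward travel reduces the distance
        remaining -= move
        if remaining > 0:            # hit a boundary with travel left: reverse
            sign = -sign
    return distance, sign


# One micron inside a 4-micron orbit, moving outward at 3 microns per second.
new_distance, new_sign = advance_one_second(distance=3, sign=-1, speed=3, radius=4)
```

In this hypothetical case the satellite travels 1 micron out to its orbit and 2 microns back, ending 2 microns from the nucleus and moving inward, for a total of 3 microns in the alphon second.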


If satellites collided with a Xenograde-4 system (i.e., 4
alphons in nucleus at t-0) at t-6, would the subsequent speed of
the satellite increase, decrease, or is it impossible to tell from
this information?

What  would  be  the  velocity  of  a  satellite  which  had  a  charge 
of    3+    at  t-2  and    4-    at  t-3? 

What  breathing  phase  would  a  Xenograde  system  be  in  if 
satellites   slowed   after   impacting   with   the   nucleus? 

PI AND PREP QUESTIONNAIRE

WELCOME TO THE QUESTIONNAIRE PORTION OF THE STUDY

The central purpose of this experiment is to try to determine why some
adults are better predictors of their own learning than others. In several
questions to follow you will be asked to make comparative judgements about
yourself as a learner. Consider the questions carefully and try
to give as accurate an appraisal as you can. Thanks!


1) In the next minute or so you will be taking a test over the
material you have just studied. In this test you will be asked
to make predictions about the movement of Xenograde systems
based upon the principles you have studied. What is your best
estimate of how well you will do on this examination?

Expressed as a % correct score.

[Response grid: Tens and Ones columns of numbered bubbles]

When you have made your prediction press this button.
PREP (Perceived Readiness for Examination Performance)

The next four questions ask you whether you think you would or would
not need to reread in order to get 20%, 40%, 60%, 80% on the forthcoming
Xenograde Science test. In addition, you are asked to quantify your confidence
in your (would/would not) choice. The percent correct you predicted on the
previous question was ?%.

1) Suppose that "passing" on the test were 20%. Do you think that you

      O  would
      O  would not

need to reread (that is, read it a second time) in order to get 20%
correct on the Xenograde examination?

1a) How confident are you that you (would/would not) have to reread to get 20%?

      O  Not sure at all
      O  Somewhat certain, but some doubt about the choice
      O  Moderately certain, only a little doubt about the choice
      O  Very certain, almost no doubt about the choice
      O  Absolutely certain, no doubt about the choice

(Next?)
PREP (Perceived Readiness for Examination Performance)

NEXT PROBE for 40%

2) Suppose that "passing" on the test were 40%. Do you think that you

      O  would
      O  would not          (Select one option)

need to reread (that is, read it a second time) in order to get 40%
correct on the Xenograde examination?

2a) How confident are you that you (would/would not) have to reread to get 40%?

      O  Not sure at all
      O  Somewhat certain, but some doubt about the choice
      O  Moderately certain, only a little doubt about the choice
      O  Very certain, almost no doubt about the choice
      O  Absolutely certain, no doubt about the choice
PREP (Perceived Readiness for Examination Performance)

NEXT PROBE for 60%

3) Suppose that "passing" on the test were 60%. Do you think that you

      O  would
      O  would not          (Select one option)

need to reread (that is, read it a second time) in order to get 60%
correct on the Xenograde examination?

3a) How confident are you that you (would/would not) have to reread to get 60%?

      O  Not sure at all
      O  Somewhat certain, but some doubt about the choice
      O  Moderately certain, only a little doubt about the choice
      O  Very certain, almost no doubt about the choice
      O  Absolutely certain, no doubt about the choice
PREP (Perceived Readiness for Examination Performance)

LAST PREP PROBE for 80%

4) Suppose that "passing" on the test were 80%. Do you think that you

      O  would
      O  would not          (Select one option)

need to reread (that is, read it a second time) in order to get 80%
correct on the Xenograde examination?

4a) How confident are you that you (would/would not) have to reread to get 80%?

      O  Not sure at all
      O  Somewhat certain, but some doubt about the choice
      O  Moderately certain, only a little doubt about the choice
      O  Very certain, almost no doubt about the choice
      O  Absolutely certain, no doubt about the choice
APPENDIX D
XENOGRADE TEST-A SAMPLE

Question #3: The following diagrams illustrate the nucleus at
various times during breathing. The systems are not in a magnetic
field.

[Diagram a: 2 alphon seconds]   [Diagram b: 7 alphon seconds]   [Diagram c: 8 alphon seconds]

Determine which diagrams are correctly labeled. Choose the
answer that correctly identifies what diagram(s) are correct and
what diagram(s) are not.


A   a and c correct; b incorrect
B   b and c correct; a incorrect
C   b correct; a and c incorrect
D   All diagrams are correct


Answer:  O A   O B   O C   O D        O Yes   O No        Confidence (Low):  O 1   O 2   O 3   O 4   O 5   O 6

Next?

Question #6: Below are several Xenograde records. Which of these
records demonstrates normal breathing?

D = none of these are correct.
Question #7: The diagrams below illustrate the motion of satellites
when a Xenograde System is placed in a magnetic field. Which of
these diagrams best illustrates the motion of a satellite in a
magnetic field?

[Diagrams: Enter (A) and Enter (B), each showing the path of a satellite; label: "Placed in magnetic field at this time"]
APPENDIX E
REGRESSION PARAMETERS OF INTERACTION TESTS

The general form of the regression model specified below was used for each
interaction test.


Y (Dep. Variable) = a + b1X1 (Group Membership) + b2X2 (Intellectual Measure)
+ b3X1X2 (Interaction Parameter) + ε
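
As an illustration only, the interaction model above can be sketched in a few lines of Python. The sketch assumes X1 is the 0/1 group code used in the table notes and relies on a standard property of this model: with a 0/1 group code and the product term included, the least-squares fit is the same as fitting one straight line per group. The slope for the group coded 0 is then b2, the slope for the group coded 1 is b2 + b3, and b3 is the difference between the two group slopes. The function names and the data are hypothetical and made up for the example; they are not taken from the study, and this is not necessarily how the reported parameters were computed.

```python
from statistics import mean


def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def interaction_coefficients(group, x2, y):
    """Return (a, b1, b2, b3) for Y = a + b1*X1 + b2*X2 + b3*X1*X2.

    X1 is a 0/1 group code. Because the model allows a separate intercept
    and slope for each group, its coefficients follow from two simple fits.
    """
    g0 = [(x, v) for g, x, v in zip(group, x2, y) if g == 0]
    g1 = [(x, v) for g, x, v in zip(group, x2, y) if g == 1]
    a0, s0 = simple_ols(*zip(*g0))  # line for the group coded 0
    a1, s1 = simple_ols(*zip(*g1))  # line for the group coded 1
    return a0, a1 - a0, s0, s1 - s0


# Made-up illustrative data (NOT from the study).
group = [1, 1, 1, 1, 0, 0, 0, 0]          # hypothetical group code
ability = [1, 2, 3, 4, 1, 2, 3, 4]        # hypothetical intellectual measure
score = [12, 14, 16, 18, 5, 8, 11, 14]    # hypothetical dependent variable

a, b1, b2, b3 = interaction_coefficients(group, ability, score)
slope_group_0 = b2
slope_group_1 = b2 + b3
print(a, b1, b2, b3, slope_group_0, slope_group_1)
```

The t-value reported for each interaction test refers to the b3 term; computing it would also require the standard error of b3, which this sketch does not attempt.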


Test of Interactions Between EQ and NEQ Groups with Visualization (Vz)


Dependent              Slope of Group
Measure                EQ        NEQ       t-value    p-value

% Actual Correct      1.260     2.318       1.34       .18
PREP                  -.029     -.014        .06       .95
PJC                    .232      .499        .98       .33
PI                     .513     -.647       1.46       .15

Note. EQ Group coded 1 and the NEQ Group coded 0.


Test of Interactions Between EQ and NEQ Groups with Verbal Ability (V3)


Dependent              Slope of Group
Measure                EQ        NEQ       t-value    p-value

% Actual Correct       .321      .794       1.13       .26
PREP                   .126     -.082       1.62       .11
PJC                    .149      .162        .10       .92
PI                    -.155     -.447        .79       .43

Note. EQ Group coded 1 and the NEQ Group coded 0.
Test of Interactions Between LB and NLB Groups with Visualization (Vz)


Dependent              Slope of Group
Measure                LB        NLB       t-value    p-value

% Actual Correct      1.644     1.720        .09       .93
PREP                  -.112     -.005        .40       .69
PJC                    .333      .356        .08       .93
PI                     .121     -.132        .31       .76

Note. LB Group coded 1 and the NLB Group coded 0.


Test of Interactions Between LB and NLB Groups with Verbal Ability (V3)


Dependent              Slope of Group
Measure                LB        NLB       t-value    p-value

% Actual Correct       .700      .394        .73       .47
PREP                   .034      .120        .65       .52
PJC                    .163      .151        .09       .93
PI                    -.415     -.220        .48       .63

Note. LB Group coded 1 and the NLB Group coded 0.


APPENDIX F
INTERCORRELATIONS OF DEPENDENT VARIABLES


                                                  %          PREP            PJC
                         PREP     PI     PJC    Correct   Calibration    Calibration
                                                             Only            Only

PREP                     1.0     -.39    .31     .23         .86             .22
PI                               1.0    -.52    -.55        -.37            -.53
PJC                                     1.0      .75         .26             .86
% Correct                                       1.0          .21             .65
PREP (Calibration                                           1.0              .23
  portion only)
PJC (Calibration                                                            1.0
  portion only)

Note. The intercorrelations (r) of all variables are statistically significant at p < .01.